Projects Alphabetical Index
An alphabetical list of all projects indexed on this site.
Apache .NET Ant Library
This is a library of Ant tasks that help with developing
.NET software. It includes the "old" .NET tasks, such as a C# compiler task, and also comes with support for NUnit testing and for running the popular NAnt or MSBuild build tools.
When assembling software out of reusable components, the task of deploying software onto an ever increasing number of targets is not trivial to solve. This becomes even harder when these targets require different components based on who's using them.
Apache ACE allows you to group those components and assign them to a managed set of targets. This allows you to distribute updates and new components easily, while keeping a full history of what was installed where during what period. It also helps you set up an automated development, QA/testing, staging and production environment.
The goal of the Apache Abdera project is to build a functionally-complete, high-performance implementation of the IETF Atom Syndication Format (RFC 4287) and Atom Publishing Protocol (RFC 5023) specifications.
The Apache Accumulo sorted, distributed key/value store is based on Google's BigTable design. It is built on top of Apache Hadoop, Zookeeper, and Thrift. It features a few novel improvements on the BigTable design in the form of cell-level access labels and a server-side programming mechanism that can modify key/value pairs at various points in the data management process.
ActiveMQ is a fast and powerful Message Broker which supports many Cross Language Clients and Protocols and many advanced features while fully supporting JMS 1.1 and J2EE 1.4.
Apache Airavata is a micro-service architecture based software framework for
executing and managing computational jobs and workflows on distributed computing
resources including local clusters, supercomputers, national grids, academic and
commercial clouds. Airavata is predominantly used to build Web-based science gateways and to help
compose, manage, execute, and monitor large-scale applications (wrapped as Web
services) and workflows composed of these services.
Apache Allura is an open source implementation of a software "forge", a web site that manages source code repositories, bug reports, discussions, wiki pages, blogs and more for any number of individual projects.
Apache Ambari makes Hadoop cluster provisioning, managing, and monitoring dead simple.
Anakia is an XML transformation tool that uses JDOM and Velocity to transform XML documents into the format of your choice. It provides an alternative to using Ant's <style> task and XSL to process XML files.
Apache Ant is a Java-based build tool.
The Ant Library provides Ant tasks for testing Ant
tasks; it can also be used to drive functional and integration tests
of arbitrary applications with Ant.
Apache Any23 is used in major Web of Data applications. It is written in Java and licensed under the Apache License v2.0. Apache Any23 can be used in various ways:
* As a library in Java applications that consume structured data from the Web.
* As a command-line tool for extracting and converting between the supported formats.
* As an online service API available at any23.org.
Archiva is the perfect companion for build tools such as Maven, Continuum, and Ant. Archiva offers several capabilities, including remote repository proxying,
security access management, build artifact storage, delivery, browsing, indexing and usage reporting, extensible scanning functionality and many more!
The Aries project is delivering a set of pluggable Java components enabling an enterprise OSGi application programming model. This includes implementations and extensions of application-focused specifications defined by the OSGi Alliance Enterprise Expert Group (EEG) and an assembly format for multi-bundle applications, for deployment to a variety of OSGi based runtimes.
Apache Avro is a data serialization system.
Apache Axiom is a StAX-based, XML Infoset compliant object model which supports on-demand building of the object tree. It supports a novel "pull-through" model which allows one to turn off the tree building and directly access the underlying pull event stream. It also has built-in support for XML Optimized Packaging (XOP) and MTOM, the combination of which allows XML to carry binary data efficiently and in a transparent manner. The result is an easy-to-use API with a high-performance architecture!
Apache Axis2 is a toolkit for creating and using Web Services, including SOAP, MTOM, XML/HTTP and advanced WS-* standards such as WSRM and WSSecurity.
Axis2 includes a very fast runtime engine, together with tooling support for WSDL and WS-Policy, and plugin support for WS-Addressing, WS-ReliableMessaging, WS-Security,
WS-Eventing, WS-Transactions, WS-Trust and WS-SecureConversation.
Axis2 runs either standalone or hosted in Tomcat or other servlet containers.
The goal of the Apache BVal project is to deliver an implementation of the Java Bean Validation Specification (JSR303) which is TCK compliant and works on Java SE 5 or later. The initial codebase for the project was donated to the ASF through an SGA from Agimatec GmbH and uses the Apache Software License v2.0.
Batik is a Java-based toolkit for applications or applets to use images in the Scalable Vector Graphics (SVG) format for various purposes, such as display, generation and manipulation.
Our goal is to make J2EE programming easier by building a simple object model on J2EE and Struts. Using Java 5 annotations, Beehive reduces the coding necessary for J2EE. The initial Beehive project has three pieces.
NetUI: An annotation-driven web application programming framework that is built atop Struts. NetUI centralizes navigation logic, state, metadata, and exception handling in a single encapsulated and reusable Page Flow Controller class. In addition, NetUI provides a set of JSP tags for rendering HTML / XHTML and higher-level UI constructs such as data grids and trees and has first-class integration with JavaServer Faces and Struts.
Controls: A lightweight, metadata-driven component framework that reduces the complexity of being a client of enterprise resources. Controls provide a unified client abstraction that can be implemented to access a diverse set of enterprise resources using a single configuration model.
Web Service Metadata (WSM): An implementation of JSR 181 which standardizes a simplified, annotation-driven model for building Java web services.
In addition, Beehive includes a set of system controls that are abstractions for low-level J2EE resource APIs such as EJB, JMS, JDBC, and web services.
Bigtop is a project for the development of packaging and tests of the Apache Hadoop ecosystem.
The primary goal of Bigtop is to build a community around the packaging and interoperability
testing of Hadoop-related projects. This includes testing at various levels (packaging, platform,
runtime, upgrade, etc.) developed by a community with a focus on the system as a whole, rather
than individual projects. In short, we strive to be for Hadoop what Debian is to Linux.
Apache Bloodhound has been created to be an open source collaboration tool to track the progress of and help distribute tasks within a project. With a particular focus on software development it includes integration with popular source control software including Apache Subversion, Git and Mercurial.
BookKeeper is a reliable replicated log service. It can be used to turn any standalone service into a highly available replicated service. BookKeeper is highly available (no single point of failure), and scales horizontally as more storage nodes are added.
We wanted something that's simple and intuitive to use, so we only need to tell it what to do, and it takes care of the rest. But also something we can easily extend for those one-off tasks, with a language that's a joy to use.
Apache CXF is an open source services framework. CXF helps you build and develop services using frontend programming APIs like JAX-WS and JAX-RS. These services can speak a variety of protocols such as SOAP, XML/HTTP, RESTful HTTP, or CORBA and work over a variety of transports such as HTTP, JMS or JBI.
Apache Camel is a powerful open source integration framework based on known Enterprise Integration Patterns.
Rules for Camel's routing and mediation engine can be defined in either a Java based DSL, XML or using DSLs for dynamic languages such as Groovy or Scala.
PMC: Apache Camel
Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make Apache Cassandra the perfect platform for mission-critical data. Cassandra's support for replicating across multiple datacenters is best-in-class.
Cassandra is in use at Netflix, Twitter, Urban Airship, Constant Contact, Reddit, Cisco, OpenX, Digg, CloudKick, Ooyala, and more companies that have large, active data sets.
Cassandra provides full Hadoop integration, including with Pig and Hive.
Cayenne is a powerful, full-featured, open source framework created for developers working with relational databases. It seamlessly maps any relational database to Java objects, reducing development time and adding considerable functionality to any application which requires a database. Developers using Cayenne will be able to concentrate on the core business requirements and the data model instead of the SQL details. The application can then be easily moved to any JDBC-capable database. In addition to management of persistent Java objects mapped to relational databases, Cayenne provides a plethora of features including single method call queries and updates (including atomic updates of all modified objects), seamless integration of multiple databases into a single virtual data source, three tier persistence with caching on the remote client, paging of results, record locking, and many more.
Celix is an implementation of the OSGi specification adapted to C.
It will follow the API as closely as possible, but since the OSGi specification is written primarily for Java, there will be differences (Java is OO, C is procedural).
An important aspect of the implementation is interoperability between Java and C. This interoperability is achieved by porting and implementing the Remote Services specification in Celix.
Apache Chainsaw is a GUI log viewer.
Apache Chemistry provides open source implementations of the Content Management Interoperability Services (CMIS) specification. Libraries are available for Java, Python, PHP and .NET.
Chukwa is an open source data collection system for monitoring
large distributed systems. Chukwa is built on top of
the Hadoop Distributed File System (HDFS) and Map/Reduce framework
and inherits Hadoop’s scalability and robustness. Chukwa also includes
a flexible and powerful toolkit for displaying, monitoring and analyzing
results to make the best use of the collected data.
Clerezza makes it easy to develop semantic web applications by providing tools to manipulate RDF data, create RESTful Web Services and Renderlets using ScalaServerPages. Contents are stored as triples based on the W3C RDF specification. These triples are stored via Clerezza’s Smart Content Binding (SCB). SCB defines a technology-agnostic layer to access and modify triple stores. It provides a Java implementation of the graph data model specified by W3C RDF and functionality to operate on that data model. SCB offers a service interface to access multiple named graphs, and it can use various providers to manage RDF graphs in a technology-specific manner, e.g., using Jena or Sesame. It also provides adaptors that allow an application to use various APIs (including the Jena API) to process RDF graphs. Furthermore, SCB offers a serialization and a parsing service to convert a graph into a certain representation (format) and vice versa.
Apache CloudStack is open source software designed to deploy and manage large
networks of virtual machines, as a highly available, highly scalable Infrastructure as
a Service (IaaS) cloud computing platform. CloudStack is used by a number of service
providers to offer public cloud services, and by many companies to provide an
on-premises (private) cloud offering, or as part of a hybrid cloud solution.
CloudStack is a turnkey solution that includes the entire "stack" of features most
organizations want with an IaaS cloud: compute orchestration, Network-as-a-Service,
user and account management, a full and open native API, resource accounting, and a
first-class User Interface (UI).
CloudStack currently supports the most popular hypervisors: VMware, KVM, XenServer and
Xen Cloud Platform (XCP).
Users can manage their cloud with an easy to use Web interface, command line tools, and
/ or a full-featured RESTful API. In addition, CloudStack provides an API that's
compatible with AWS EC2 and S3 for organizations that wish to deploy hybrid clouds.
Apache Cocoon is a web development framework built around the concepts of separation of concerns (making sure people can interact and collaborate on a project, without stepping on each other's toes) and component-based web development. Cocoon implements these concepts around the notion of "component pipelines", each component on the pipeline specializing in a particular operation. This makes it possible to use a "building block" approach for web solutions, hooking together components into pipelines without any required programming.
Apache Commons BCEL
The Byte Code Engineering Library is intended to give users a convenient way to analyze, create, and manipulate (binary) Java class files (those ending with .class). Classes are represented by objects which contain all the symbolic information of the given class: methods, fields and byte code instructions, in particular.
Apache Commons BSF
Bean Scripting Framework (BSF) is a set of Java classes which provides scripting language support within Java applications, and access to Java objects and methods from scripting languages. BSF allows one to write JSPs in languages other than Java while providing access to the Java class library. In addition, BSF permits any Java application to be implemented in part (or dynamically extended) by a language that is embedded within it. This is achieved by providing an API that permits calling scripting language engines from within Java, as well as an object registry that exposes Java objects to these scripting language engines.
Apache Commons BeanUtils
BeanUtils provides an easy-to-use but flexible wrapper around reflection and introspection.
Apache Commons CLI
Commons CLI provides a simple API for presenting, processing and
validating a command line interface.
Apache Commons Chain
An implementation of the GoF Chain of Responsibility pattern
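As a rough illustration of the Chain of Responsibility pattern itself (not of the Commons Chain API, which is a Java library), the sketch below passes a shared context through an ordered list of handlers until one reports that it has handled the request. The names `run_chain`, `reject_anonymous`, and `serve_request` are hypothetical and chosen only for this example.

```python
from typing import Callable, Dict, List

# A handler inspects the shared context and returns True when it has
# fully handled the request, which stops the chain.
Handler = Callable[[Dict[str, object]], bool]


def run_chain(handlers: List[Handler], context: Dict[str, object]) -> bool:
    """Pass the context to each handler in order until one handles it."""
    for handler in handlers:
        if handler(context):
            return True
    return False


def reject_anonymous(context: Dict[str, object]) -> bool:
    """Hypothetical handler: stop the chain when no user is present."""
    if context.get("user") is None:
        context["result"] = "denied"
        return True
    return False


def serve_request(context: Dict[str, object]) -> bool:
    """Hypothetical handler: the last link, which always handles the request."""
    context["result"] = "served to " + str(context["user"])
    return True


chain = [reject_anonymous, serve_request]
```

Calling `run_chain(chain, {"user": "alice"})` lets the first handler decline and the second one finish the request; the caller never needs to know which handler did the work.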
Apache Commons Codec
The codec package contains simple encoders and decoders for
various formats such as Base64 and Hexadecimal. In addition to these
widely used encoders and decoders, the codec package also maintains a
collection of phonetic encoding utilities.
Apache Commons Collections
Types that extend and augment the Java Collections Framework.
Apache Commons Compress
Commons Compress: working with zip, ar, jar, bz2, cpio, tar, gz, dump, pack200, lzma, 7z, arj and xz files.
Apache Commons Configuration
Library to use configuration/preferences of various sources and formats.
Apache Commons DBCP
Commons Database Connection Pooling
Apache Commons Daemon
Apache Commons DbUtils
A package of Java utility classes for easing JDBC development
Apache Commons Digester
The Digester package lets you configure an XML->Java object mapping module
which triggers certain actions called rules whenever a particular
pattern of nested XML elements is recognized.
Apache Commons Discovery
Apache Commons EL
JSP 2.0 Expression Language Interpreter Implementation
Apache Commons Email
Commons-Email aims to provide an API for sending email.
It is built on top of the Java Mail API, which it aims to simplify.
Apache Commons Exec
A library to reliably execute external processes from within the JVM
Apache Commons FileUpload
The FileUpload component provides a simple yet flexible means of adding
support for multipart file upload functionality to servlets and web
Apache Commons Functor
The Apache Commons Functor library defines common functor and functor-related interfaces,
implementations, and utilities.
Apache Commons HttpClient
Commons HttpClient is a library for client-side HTTP communication.
It provides support for HTTP/1.1 and HTTP/1.0, plus
various authentication schemes and cookie policies.
Thanks to its widespread use and years of development, it is a very
mature and stable codebase. However, due to limitations in the API design,
Commons HttpClient will eventually be replaced by HttpClient 4.0
with a completely redesigned API based on HttpCore.
Apache Commons IO
Commons-IO contains utility classes, stream implementations, file filters, file comparators and endian classes.
Apache Commons JCI
Commons JCI provides a unified interface to any of several Java compilers.
Apache Commons JCS
Comprehensive Caching System
Apache Commons JEXL
Jexl is an implementation of the JSTL Expression Language with extensions.
Apache Commons JXPath
A Java-based implementation of XPath 1.0 that, in addition to XML processing, can inspect/modify Java object graphs (the library's explicit purpose) and even mixed Java/XML structures.
Apache Commons Jelly
Jelly is a Java and XML based scripting engine. Jelly combines the best ideas from JSTL, Velocity, DVSL, Ant and Cocoon all together in a simple yet powerful scripting engine.
Apache Commons Lang
Commons Lang, a package of Java utility classes for the
classes that are in java.lang's hierarchy, or are considered to be so
standard as to justify existence in java.lang.
Apache Commons Launcher
Launcher is a set of Java classes that aim to provide a cross-platform
Java application launcher.
Apache Commons Logging
Commons Logging is a thin adapter allowing configurable bridging to other,
well known logging systems.
Apache Commons Math
The Math project is a library of lightweight, self-contained mathematics and statistics components addressing the most common practical problems not immediately available in the Java programming language or commons-lang.
Apache Commons Modeler
Apache Commons Net
Apache Commons OGNL
The Apache Commons OGNL library is a Java development framework for Object-Graph Navigation Language,
plus other extras such as list projection and selection and lambda expressions.
Apache Commons Pool
Commons Object Pooling Library
Apache Commons Primitives
Commons Primitives is a set of collection and utility classes for primitive types.
The Java language has a clear distinction between Object and primitive types.
A lot of functionality is provided for Object types, including the Java Collection Framework.
Relatively little functionality is provided by the JDK for primitives.
This package addresses this by providing a set of utility and collection classes for primitives.
Apache Commons Proxy
Commons Dynamic Proxy Library
Apache Commons SCXML
An implementation of the State Chart XML specification aimed at creating
and maintaining a Java SCXML engine. It is capable of executing an environment
agnostic state machine defined using an SCXML document.
Apache Commons VFS
VFS is a Virtual File System library.
Apache Commons Validator
Commons Validator provides the building blocks for both client side validation
and server side data validation. It may be used standalone or with a framework like
Apache Commons Weaver
Apache Commons Weaver provides an easy way to enhance compiled Java
classes by generating ("weaving") bytecode into those classes.
Apache Compress Ant Library
This is a library of Ant tasks and types that uses Apache
Commons Compress to support additional archive formats like ar,
pack200, xz and cpio.
Whether you have a centralized build team or want to put control of releases in the hands of developers, Apache Continuum can help you improve quality and maintain a consistent build environment. Follow us on Twitter @apachecontinuum to get the latest news and updates!
The Apache Crunch Java library provides a framework for writing, testing, and running MapReduce pipelines. Its goal is to make pipelines that are composed of many user-defined functions simple to write, easy to test, and efficient to run.
Running on top of Hadoop MapReduce and Apache Spark, the Apache Crunch™ library is a simple Java API for tasks like joining and data aggregation that are tedious to implement on plain MapReduce. The APIs are especially useful when processing data that does not fit naturally into a relational model, such as time series, serialized object formats like protocol buffers or Avro records, and HBase rows and columns. For Scala users, there is the Scrunch API, which is built on top of the Java APIs and includes a REPL (read-eval-print loop) for creating MapReduce pipelines.
New users of ZooKeeper are surprised to learn that a significant amount of connection management must be done manually. For example, when the ZooKeeper client connects to the ensemble it must negotiate a new session, etc. This takes some time. If you use a ZooKeeper client API before the connection process has completed, ZooKeeper will throw an exception. These types of exceptions are referred to as "recoverable" errors. Curator automatically handles connection management, greatly simplifying client code. Instead of directly using the ZooKeeper APIs you use Curator APIs that internally check for connection completion and wrap each ZooKeeper API in a retry loop. Curator uses a retry mechanism to handle recoverable errors and automatically retry operations. The method of retry is customizable. Curator comes bundled with several implementations (ExponentialBackoffRetry, etc.) or custom implementations can be written.
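The retry-loop idea described above can be sketched in a few lines. This is only a generic illustration of wrapping an operation in a retry loop with exponential backoff; it does not use or reproduce the Curator or ZooKeeper APIs. `RecoverableError`, `retry_with_backoff`, and its parameters are hypothetical names introduced for this example.

```python
import time


class RecoverableError(Exception):
    """Hypothetical stand-in for a transient, retryable failure."""


def retry_with_backoff(operation, max_retries=3, base_sleep=0.1, sleep=time.sleep):
    """Call operation(), retrying on RecoverableError.

    The wait before retry number n (counting from 0) is base_sleep * 2**n.
    After max_retries retries the last error is re-raised to the caller.
    """
    attempt = 0
    while True:
        try:
            return operation()
        except RecoverableError:
            if attempt >= max_retries:
                raise
            sleep(base_sleep * (2 ** attempt))  # wait doubles on each retry
            attempt += 1
```

The `sleep` parameter is injectable so that the policy can be exercised without real delays; a production policy would typically also add random jitter and an upper bound on the wait.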
Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) is an open-source natural language processing system for information extraction from electronic medical record clinical free-text. It processes clinical notes, identifying types of clinical named entities from various dictionaries including the Unified Medical Language System (UMLS) - medications, diseases/disorders, signs/symptoms, anatomical sites and procedures. Each named entity has attributes for the text span, the ontology mapping code, subject (patient, family member, etc.) and context (negated/not negated, conditional, generic, degree of certainty). Some of the attributes are expressed as relations, for example the location of a clinical condition (locationOf relation) or the severity of a clinical condition (degreeOf relation).
Apache DataFu consists of two libraries:
Apache DataFu Pig is a collection of useful user-defined functions for data analysis in Apache Pig.
Apache DataFu Hourglass is a library for incrementally processing data using Apache Hadoop MapReduce. This library was inspired by the prevalence of sliding window computations over daily tracking data. Computations such as these typically happen at regular intervals (e.g. daily, weekly), and therefore the sliding nature of the computations means that much of the work is unnecessarily repeated. DataFu's Hourglass was created to make these computations more efficient, sometimes yielding 50-95% reductions in computational resources.
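To see why a sliding window repeats work, and how reusing earlier results avoids it, consider the following in-memory sketch. It is not the Hourglass API and does not run on Hadoop; `SlidingWindowCounter` is a hypothetical class that keeps one partial count per day and updates a running total by adding the newest day and subtracting the day that falls out of the window, instead of recounting every day in the window.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Hypothetical incremental event counter over the last `window_days` days."""

    def __init__(self, window_days):
        self.window_days = window_days
        self.daily = deque()    # per-day partial counts currently in the window
        self.total = Counter()  # running aggregate over the whole window

    def add_day(self, events):
        """Fold in one new day of events and return the counts for the window."""
        partial = Counter(events)
        self.daily.append(partial)
        self.total.update(partial)
        if len(self.daily) > self.window_days:
            # Only the expired day is subtracted; the days in between are reused.
            expired = self.daily.popleft()
            self.total.subtract(expired)
            for key in [k for k, v in self.total.items() if v <= 0]:
                del self.total[key]
        return dict(self.total)
```

Each call touches only the new day and the expired day, so the cost per update no longer grows with the window length.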
Apache DeltaSpike is a portable JSR-299 CDI (Contexts and Dependency Injection for Java) Extension library which contains lots of useful tools and helpers which are missing in the CDI core spec.
DeltaSpike is not a CDI-container itself, but a portable
Extension library which can run on all CDI-containers!
DeltaSpike is tested and runs on many Java EE Servers like Apache TomEE, Red Hat JBoss Application Server, JBoss Wildfly, Oracle WebLogic, Oracle Glassfish, IBM WebSphere, and also on simple Servlet containers like Apache Tomcat or Jetty in combination with either JBoss Weld or Apache OpenWebBeans.
Deltacloud contains a cloud abstraction API - whether the Deltacloud classic API, the DMTF CIMI API or even the EC2 API. Each abstraction API works as a wrapper around a large number of clouds, shielding users from their differences. For every cloud provider there is a driver "speaking" that cloud provider's native API, freeing you from dealing with the particulars of each cloud's API.
Apache Derby is an open source relational database implemented entirely in Java. It has a small footprint that makes it easy to embed in any Java-based application, but it also supports the more familiar client/server mode. It is based on the Java, JDBC, and SQL standards, making code developed more portable to standards-compliant databases.
Apache Devicemap is a data repository containing device attributes and their related browsers and operating systems. The project also maintains an API to classify these attributes.
Apache DirectMemory is an off-heap cache for the Java Virtual Machine.
The Apache Directory project provides directory solutions entirely written in Java. These include a directory server, which has been certified as LDAP v3 compliant by the Open Group (ApacheDS), and Eclipse-based directory tools (Apache Directory Studio).
Apache Directory Server
ApacheDS is an extensible and embeddable directory server entirely written in Java, which has been certified LDAPv3 compatible by the Open Group. Besides LDAP it supports Kerberos 5 and the Change Password Protocol. It has been designed to introduce triggers, stored procedures, queues and views to the world of LDAP which has lacked these rich constructs.
Apache Directory Studio
Apache Directory Studio is a complete directory tooling platform intended to be used with any LDAP server; however, it is particularly designed for use with ApacheDS. It is an Eclipse RCP application, composed of several Eclipse (OSGi) plugins, that can be easily upgraded with additional ones. These plugins can even run within Eclipse itself.
Apache Drill is a distributed MPP query layer that supports SQL and alternative query languages against NoSQL and Hadoop data storage systems. It was inspired in part by Google's Dremel.
Apache Droids (incubating)
Apache Droids (incubating) aims to be an intelligent standalone robot
framework that allows robots to be created as plugins. Droids makes it very
easy to extend existing robots or write a new one from scratch, which can
automatically seek out relevant online information based on the user's specifications.
Droids (plural) is not designed for a specific use case; it is a framework:
take what you need, do what you want.
The Element Construction Set is a Java API for generating elements for various markup languages. It directly supports HTML 4.0 and XML, but can easily be extended to create tags for any markup language.
Apache ESME (Enterprise Social Messaging Environment) is a secure and highly scalable microsharing and micromessaging platform that allows people to discover and meet one another and get controlled access to other sources of information, all in a business process context.
You can hardly turn a web page these days without seeing a story that describes how people are using social networks, whether it is Twitter, Facebook or some other service, to develop and build their personal communities. In business, we increasingly see blogs and wikis demonstrating utility in problem solving and communications, but the real-time nature of business process problem solving largely remains untouched by social networking tools. Existing services, while attractive, do not scale well and have proven unreliable. This is unacceptable to business, which must be 'Always On' and able to support people in their daily working lives. Such applications must therefore be scalable and reliable but also provide a lot more.
When solving problems, how good might it be if a user were able to tap into the collective knowledge of her peers or surrounding groups of people with whom she might naturally network in the workplace setting? How much quicker and with greater precision might she be able to solve daily problems? What if there were a communications mechanism that took the best of what services like Twitter offer and combined it with readily recognizable business processes? That solution is Apache ESME.
Apache Empire-db is a relational database abstraction layer that allows developers to take a much more SQL-centric approach in application development than traditional Object/Relational mapping frameworks (ORM). With its unique object orientated command API it allows the creation of SQL-statements of any complexity that take full advantage of all DBMS features which leads to highly efficient database operations and code. Additionally by eliminating the use of error-prone string operations it also offers an unprecedented level of ease-of-use and compile-time-safety.
Etch is a cross-platform, language- and transport-independent framework for building and consuming network services. The Etch toolset includes a network service description language, a compiler, and binding libraries for a variety of programming languages. Etch is also transport-independent, allowing for a variety of different transports to be used based on need and circumstance. The goal of Etch is to make it simple to define small, focused services that can be easily accessed, combined, and deployed in a similar manner. With Etch, service development and consumption becomes no more difficult than library development and consumption.
The successor of Apache Avalon, Apache Excalibur hosts the Avalon framework, a Java container framework, the Excalibur and Fortress inversion of control containers, and a rich library of components. Excalibur code powers Apache James and Cocoon and numerous other open source and commercial projects.
FOP (Formatting Objects Processor) is the world's first print formatter driven by XSL formatting objects (XSL-FO) and the world's first output independent formatter. It is a Java application that reads a formatting object (FO) tree and renders the resulting pages to a specified output. Output formats currently supported include PDF, PCL, PS, SVG, XML (area tree representation), Print, AWT, MIF and TXT. The primary output target is PDF.
Apache Falcon is a data processing and management solution for Hadoop designed for data motion, coordination of data pipelines, lifecycle management, and data discovery. Falcon enables end consumers to quickly onboard their data and its associated processing and management tasks on Hadoop clusters.
OSGi framework implementation and related technologies.
Apache Flex® is a highly productive, open source application framework for building and maintaining expressive web applications that deploy consistently on all major browsers, desktops and devices (including smartphones, tablets and TV). It provides a modern, standards-based language and programming model that supports common design patterns suitable for developers from many backgrounds. Flex applications can be deployed to the ubiquitous Adobe® Flash® Player in the browser, Adobe® AIR™ on desktop and mobile or to native Android™, iOS™, QNX®, Windows® or Mac® applications.
Flink is an open source system for expressive, declarative, fast, and efficient data analysis. It combines the scalability and programming flexibility of distributed MapReduce-like platforms with the efficiency, out-of-core execution, and query optimization capabilities found in parallel databases.
Apache Flume is a distributed, reliable, and available system for efficiently collecting, aggregating and moving large amounts of log data from many different sources to a centralized data store.
Apache Forrest™ software is a publishing framework that transforms
input from various sources into a unified presentation in one or more
output formats. The modular and extensible plug-in architecture of
Apache Forrest is based on Apache Cocoon and the relevant industry
standards that separate presentation from content. Forrest can generate
static documents, or be used as a dynamic server, or be deployed by its
The Apache FtpServer application is a 100% pure Java FTP server. It's designed to be a complete and portable FTP server engine solution based on currently available open protocols. FtpServer can be run standalone as a Windows service or Unix/Linux daemon, or embedded into a Java application. We also provide support for integration within Spring applications and provide our releases as OSGi bundles.
Apache Geronimo is an open source server runtime that integrates the best open source projects to create Java/OSGi server runtimes that meet the needs of enterprise developers and system administrators. Our most popular distribution is a fully certified Java EE 5 application server runtime.
Apache Giraph is an iterative graph processing system built for high scalability. For example, it is currently used at Facebook to analyze the social graph formed by users and their connections.
Although there are various excellent ORM frameworks for relational databases, data modeling in NoSQL data stores differs profoundly from that of their relational cousins. Moreover, data-model agnostic frameworks such as JDO are not sufficient for use cases where one needs to use the full power of the data models in column stores. Gora fills this gap by giving the user an easy-to-use in-memory data model and persistence framework for big data, with data store specific mappings and built-in Apache Hadoop support.
Gump provides large scale continuous integration for various open source projects.
Use Apache HBase software when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware. HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop and HDFS.
Apache HTTP Server
The Apache HTTP Server is an open-source HTTP server for modern
operating systems including UNIX, Microsoft Windows, Mac OS/X and Netware.
The goal of this project is to provide a secure, efficient and
extensible server that provides HTTP services observing the current
HTTP standards. Apache has been the most popular web server on the
Internet since April of 1996.
Hadoop is a distributed computing platform. This includes the Hadoop Distributed Filesystem (HDFS) and an implementation of MapReduce.
Apache Hama is an efficient and scalable general-purpose BSP computing engine which can be used to speed up a large variety of compute-intensive analytics applications.
Apache Harmony software is a modular Java runtime with class libraries and associated tools.
Apache Helix is a generic cluster management framework used for the automatic management of partitioned, replicated and distributed resources hosted on a cluster of nodes. Helix automates reassignment of resources in the face of node failure and recovery, cluster expansion, and reconfiguration.
The Apache Hive (TM) data warehouse software facilitates querying and managing large datasets residing in distributed storage. Built on top of Apache Hadoop (TM), it provides
* tools to enable easy data extract/transform/load (ETL)
* a mechanism to impose structure on a variety of data formats
* access to files stored either directly in Apache HDFS (TM) or in other data storage systems such as Apache HBase (TM)
* query execution via MapReduce
Hive defines a simple SQL-like query language, called HiveQL, that enables users familiar with SQL to query the data. At the same time, this language also allows programmers who are familiar with the MapReduce framework to be able to plug in their custom mappers and reducers to perform more sophisticated analysis that may not be supported by the built-in capabilities of the language. HiveQL can also be extended with custom scalar functions (UDF's), aggregations (UDAF's), and table functions (UDTF's).
Apache HttpComponents Client
HttpClient is a library for client-side HTTP communication built on HttpCore.
It provides connection management, cookie management, and authentication.
This is the successor to the widely used Jakarta Commons HttpClient 3.1.
Apache HttpComponents Core
HttpCore is a set of low level HTTP transport components that can be used to build custom
client and server side HTTP services with a minimal footprint. HttpCore supports two I/O
models: blocking I/O model based on the classic Java I/O and non-blocking, event driven I/O
model based on Java NIO. The blocking I/O model may be more appropriate for data intensive,
low latency scenarios, whereas the non-blocking model may be more appropriate for high latency
scenarios where raw data throughput is less important than the ability to handle thousands of
simultaneous HTTP connections in a resource efficient manner.
Apache Isis is a framework for rapidly developing domain-driven apps in Java. Write your business logic in entities, domain services and repositories, and the framework dynamically (at runtime) generates a representation of that domain model as a webapp or as a RESTful API. For prototyping or production.
Apache Ivy is a very powerful dependency manager oriented toward Java dependency management, even though it could be used to manage dependencies of any kind.
IvyDE lets you manage the dependencies declared in an ivy.xml file in your Java Eclipse projects. IvyDE contributes to the classpath of your Java project through a classpath container. It also brings an editor for ivy.xml files, with completion.
The Apache Java Enterprise Mail Server (a.k.a. Apache James) is a 100% pure Java SMTP and POP3 Mail server and NNTP News server. We have designed James to be a complete and portable enterprise mail engine solution based on currently available open protocols.
James is also a mail application platform. We have developed a Java API to let you write Java code to process emails that we call the mailet API. A mailet can generate an automatic reply, update a database, prevent spam, build a message archive, or whatever you can imagine. A matcher determines whether your mailet should process an email in the server. The James project hosts the Mailet API, and James provides an implementation of this mail application platform API.
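The matcher/mailet split described above can be sketched in a few lines. This is only an illustration of the idea in Python, not the Java Mailet API: the names Mail, subject_contains, auto_reply and process are hypothetical and do not come from James.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Mail:
    """Hypothetical, minimal stand-in for a message passing through the server."""
    sender: str
    recipient: str
    subject: str
    log: List[str] = field(default_factory=list)


def subject_contains(word: str) -> Callable[[Mail], bool]:
    """Build a matcher: decides whether a mailet should process a given mail."""
    return lambda mail: word.lower() in mail.subject.lower()


def auto_reply(mail: Mail) -> None:
    """A mailet: here it only records that an automatic reply would be sent."""
    mail.log.append(f"auto-reply sent to {mail.sender}")


def process(mail: Mail, rules: List[Tuple[Callable[[Mail], bool], Callable[[Mail], None]]]) -> Mail:
    """Run every mailet whose matcher accepts the mail."""
    for matcher, mailet in rules:
        if matcher(mail):
            mailet(mail)
    return mail


rules = [(subject_contains("vacation"), auto_reply)]
matched = process(Mail("a@example.org", "b@example.org", "Vacation plans"), rules)
skipped = process(Mail("a@example.org", "b@example.org", "Invoice"), rules)
```

Keeping the matching decision separate from the processing step means the same mailet can be reused behind different matchers.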
Apache JMeter may be used to test performance both on static and dynamic resources (files, Servlets, Perl scripts, Java Objects, Data Bases and Queries, FTP Servers and more). It can be used to simulate a heavy load on a server, network or object to test its strength or to analyze overall performance under different load types. You can use it to make a graphical analysis of performance or to test your server/script/object behavior under heavy concurrent load.
Apache JSPWiki is a feature-rich and extensible WikiWiki engine built around the standard JEE
components (Java, servlets, JSP). It features:
- WikiMarkup/Structured Text
- File attachments
- Templates support
- Data storage through 3 WikiPage Providers, with the capability to plug new ones
- Security: Authorization and authentication fine grain control
- Easy plugin interface
- UTF-8 support
- Easy-ish installation
- Page locking to prevent editing conflicts
- Support for Multiple Wikis
Apache Jackrabbit is a fully conforming implementation of the
Content Repository for Java Technology API (JCR). A content
repository is a hierarchical content store with support for
structured and unstructured content, full text search, versioning,
transactions, observation, and more. Typical applications that use
content repositories include content management, document management,
and records management systems.
Apache Jakarta Cactus
The intent of Cactus is to lower the cost of writing tests for server-side code. It uses JUnit and extends it.
Cactus implements an in-container strategy, meaning that tests are executed inside the container.
Apache Jena provides a complete framework for building Semantic Web and Linked Data applications in Java, and provides: parsers for RDF/XML, Turtle and N-triples; a Java programming API; a complete implementation of the SPARQL query language; a rule-based inference engine for RDFS and OWL entailments; TDB (a non-SQL persistent triple store); SDB (a persistent triples store built on a relational store) and Fuseki, an RDF server using web protocols. Jena complies with all relevant recommendations for RDF and related technologies from the W3C.
Apache jclouds is an open source multi-cloud toolkit for the Java platform that gives you the freedom to create applications that are portable across clouds while giving you full control to use cloud-specific features.
A single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients. Kafka is designed to allow a single cluster to serve as the central data backbone for a large organization. It can be elastically and transparently expanded without downtime. Data streams are partitioned and spread over a cluster of machines to allow data streams larger than the capability of any single machine and to allow clusters of co-ordinated consumers. Kafka has a modern cluster-centric design that offers strong durability and fault-tolerance guarantees. Messages are persisted on disk and replicated within the cluster to prevent data loss. Each broker can handle terabytes of messages without performance impact.
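The idea of partitioning a data stream across machines can be illustrated with a small sketch. It assumes keyed messages and a simple CRC32-modulo rule chosen purely for illustration; it is not Kafka's own partitioner, and partition_for and append are hypothetical helper names.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple


def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition (illustrative hash, not Kafka's)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def append(log: Dict[int, List[Tuple[str, str]]], key: str, value: str,
           num_partitions: int) -> int:
    """Append a message to the partition chosen for its key."""
    partition = partition_for(key, num_partitions)
    log[partition].append((key, value))
    return partition


log: Dict[int, List[Tuple[str, str]]] = defaultdict(list)
first = append(log, "user-42", "login", 4)
second = append(log, "user-42", "logout", 4)
```

Because the partition depends only on the key, all messages for one key land in the same partition and keep their relative order there, while different keys can spread over many machines.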
Apache Karaf is a small OSGi based runtime which provides a lightweight container onto which various components and applications can be deployed.
The Apache Knox Gateway is a REST API Gateway for interacting with Hadoop clusters.
The Knox Gateway provides a single access point for all REST interactions with Hadoop clusters.
In this capacity, the Knox Gateway is able to provide valuable functionality to aid in the control,
integration, monitoring and automation of critical administrative and analytical needs of the enterprise.
Authentication (LDAP and Active Directory Authentication Provider)
Federation/SSO (HTTP Header Based Identity Federation)
Authorization (Service Level Authorization)
While there are a number of benefits for unsecured Hadoop clusters,
the Knox Gateway also complements the kerberos secured cluster quite nicely.
Coupled with proper network isolation of a Kerberos secured Hadoop cluster,
the Knox Gateway provides the enterprise with a solution that:
Integrates well with enterprise identity management solutions
Protects the details of the Hadoop cluster deployment (hosts and ports are hidden from endusers)
Simplifies the number of services that clients need to interact with
Apache Lenya is an Open Source Java/XML Content Management Framework and comes with revision control, site management, scheduling, search, WYSIWYG editors, and workflow.
Apache Libcloud is a standard Python library that abstracts away differences among multiple cloud provider APIs. It allows users to manage cloud servers, cloud storage and load-balancers.
Apache Lucene Core
Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.
Lucene.Net is a port of the Lucene search engine library, written in C# and targeted at .NET runtime users. The Lucene search library is based on an inverted index. Lucene.Net has three primary goals:
1. Maintain the existing line-by-line port from Java to C#, fully automating and commoditizing the process such that the project can easily synchronize with the Java Lucene release schedule;
2. Maintain the high-performance requirements expected of a first class C# search engine library;
3. Maximize usability and power when used within the .NET runtime. To that end, it will present a highly idiomatic, carefully tailored API that takes advantage of many of the special features of the .NET runtime.
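For readers unfamiliar with the term, an inverted index maps each term to the set of documents that contain it. The sketch below is a toy version in Python with whitespace tokenization; build_index and search are hypothetical names, and it says nothing about how Lucene or Lucene.Net actually store or score their indexes.

```python
from collections import defaultdict
from typing import Dict, Set


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each lowercased, whitespace-separated term to the ids of documents containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Return the ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {
    1: "Apache Lucene search library",
    2: "Lucene.Net port for .NET",
    3: "search engine library",
}
index = build_index(docs)
```

A query is answered by intersecting the posting sets of its terms, so the cost depends on how many documents contain those terms rather than on the total amount of text.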
The Apache Lucy search engine library provides full-text search for dynamic programming languages.
Apache log4cxx provides logging services for C++.
Apache log4j provides logging services for Java.
Apache log4net provides logging services for .NET.
Apache log4php is a logging framework for PHP.
Apache MINA is a network application framework which helps users develop high performance and high scalability network applications easily. It provides an abstract, event-driven, asynchronous API over various transports such as TCP/IP and UDP/IP via Java NIO.
Apache MRUnit is a Java library that helps developers unit test Apache Hadoop MapReduce jobs.
Scalable machine learning library
ManifoldCF is an effort to provide an open source framework for connecting source content repositories, such as Microsoft SharePoint and EMC Documentum, to target repositories or indexes, such as Apache Solr, OpenSearchServer or ElasticSearch. ManifoldCF also defines a security model for target repositories that permits them to enforce source-repository security policies.
The goal of Apache Marmotta is to provide an open implementation of a Linked Data Platform that can be used, extended and deployed easily by organizations who want to publish Linked Data or build custom applications on Linked Data.
Maven is a project development management and comprehension tool. Based on the concept of a project object model: builds, dependency management, documentation creation, site publication, and distribution publication are all controlled from the declarative file. Maven can be extended by plugins to utilise a number of other development tools for reporting or the build process.
Apache Mesos is a cluster manager that provides efficient resource isolation
and sharing across distributed applications, or frameworks. It can run Hadoop,
MPI, Hypertable, Spark, and other frameworks on a dynamically shared pool of
With MetaModel you get a uniform connector and query API to many very different datastore types, including: Relational (JDBC) databases, CSV files, Excel spreadsheets, XML files, JSON files, Fixed width files, MongoDB, Apache CouchDB, Apache HBase, Apache Cassandra, ElasticSearch, OpenOffice.org databases, Salesforce.com, SugarCRM and even collections of plain old Java objects (POJOs).
MetaModel isn't a data mapping framework. Instead we emphasize abstraction of metadata and ability to add data sources at runtime, making MetaModel great for generic data processing applications, less so for applications modeled around a particular domain.
MyFaces is the free open source implementation of JavaServer(tm) Faces, a new and upcoming web application framework that accomplishes the MVC paradigm. It is comparable to the well-known Struts Framework but has features and concepts that are beyond those of Struts - especially the component orientation.
mod_ftp is an FTP Protocol module to serve httpd content over the FTP
protocol (wherever the HTTP protocol could also be used). It provides
both RETR/REST retrieval and STOR/APPE upload, using the same user/permissions
model as httpd (so it shares the same security considerations as mod_dav
mod_perl is a unique piece of software that integrates the power of
Perl with the flexibility and stability of the Apache Web server.
With mod_perl, you can harness the power of the full Apache API with
Perl and develop Web applications quickly, without sacrificing
Apache Nutch is a highly extensible and scalable open source web crawler software project. Stemming from Apache Lucene, the project has diversified and now comprises two codebases. Nutch 1.x is a well-matured, production-ready crawler; it enables fine-grained configuration and relies on Apache Hadoop data structures, which are great for batch processing. Nutch 2.x is an emerging alternative that takes direct inspiration from 1.x but differs in one key area: storage is abstracted away from any specific underlying data store by using Apache Gora to handle object-to-persistent mappings. This makes it possible to implement an extremely flexible model/stack for storing everything (fetch time, status, content, parsed text, outlinks, inlinks, etc.) in a number of NoSQL storage solutions. Being pluggable and modular has its benefits: Nutch provides extensible interfaces such as Parse, Index and ScoringFilter for custom implementations, e.g. Apache Tika for parsing. Additionally, pluggable indexing exists for Apache Solr, Elastic Search, etc. Nutch can run on a single machine, but gains a lot of its strength from running in a Hadoop cluster.
Apache ODE (Orchestration Director Engine) executes business processes written following the WS-BPEL standard. It talks to web services, sending and receiving messages, handling data manipulation and error recovery as described by your process definition. It supports both long and short living process executions to orchestrate all the services that are part of your application.
WS-BPEL is an XML-based language defining several constructs to write business processes. It defines a set of basic control structures like conditions or loops as well as elements to invoke web services and receive messages from services. It relies on WSDL to express web services interfaces. Message structures can be manipulated, assigning parts or the whole of them to variables that can in turn be used to send other messages.
Apache OFBiz is an open source product for the automation of enterprise processes that includes framework components and business applications
for ERP (Enterprise Resource Planning), CRM (Customer Relationship Management), E-Business / E-Commerce, SCM (Supply Chain Management),
MRP (Manufacturing Resource Planning), MMS/EAM (Maintenance Management System/Enterprise Asset Management), POS (Point Of Sale).
Apache OFBiz provides a foundation and starting point for reliable, secure and scalable enterprise solutions.
Apache OODT software is component based, and offers a software architecture beyond simple science applications.
A set of text-processing Java classes that provide Perl5 compatible regular expressions, AWK-like regular expressions, glob expressions, and utility classes for performing substitutions, splits, filtering filenames, etc.
Apache Olingo is a Java library that implements the Open Data Protocol (OData). Apache Olingo serves client and server aspects of OData. It currently supports OData 2.0 and OData 4.0 (beta). The latter is the OASIS version of the protocol: OASIS Open Data Protocol (OData) TC.
The extensions part of Olingo for OData 2.0 contains additional features like the support of JPA persistency or annotated bean classes.
Apache Oltu - Parent
Apache Oltu is an OAuth protocol implementation in Java.
Apache Onami is a project focused on the development and maintenance of a set of Google Guice extensions.
Oozie is a workflow scheduler system to manage Apache Hadoop jobs. Oozie is integrated with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box (such as Java map-reduce, Streaming map-reduce, Pig, Hive, Sqoop and Distcp) as well as system specific jobs (such as Java programs and shell scripts).
Apache Open Climate Workbench
Apache Open Climate Workbench is an effort to develop software that performs climate model evaluation using model outputs from a variety of different sources (the Earth System Grid Federation, the Coordinated Regional Downscaling Experiment, the U.S. National Climate Assessment and the North American Regional Climate Change Assessment Program) and temporal/spatial scales with remote sensing data from NASA, NOAA and other agencies. The toolkit includes capabilities for rebinning, metrics computation and visualization.
Apache OpenJPA is a Java persistence project at The Apache Software Foundation that can be used as a stand-alone POJO persistence layer or integrated into any Java EE compliant container and many other lightweight frameworks, such as Tomcat and Spring. The 1.x releases are a production ready, feature-rich, compliant implementation of the Java Persistence API (JPA) 1.0 part of the JSR-220 Enterprise Java Beans 3.0 specification, which pass the Sun JPA 1.0b Technology Compatibility Kit. The 2.x releases are a production ready, compliant implementation of the JSR-317 Java Persistence 2.0 specification, which is backwards compatible to the JPA 1.0 specification and passes the Sun JPA 2.0 Technology Compatibility Kit.
Apache OpenMeetings provides video conferencing, instant messaging, whiteboard,
collaborative document editing and other groupware tools using API functions
of the Red5 Streaming Server for Remoting and Streaming.
Apache OpenNLP software supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services. OpenNLP also includes maximum entropy and perceptron based machine learning.
Apache OpenOffice is an open-source, office-document productivity
suite providing six productivity applications based around the
OpenDocument Format (ODF). OpenOffice is released on multiple
platforms and in dozens of languages.
OpenWebBeans is an ALv2-licensed implementation of the "Contexts and Dependency Injection for the Java EE platform" specification which is defined as JSR-299.
ORC is a self-describing type-aware columnar file format designed for
Hadoop workloads. It is optimized for large streaming reads, but with
integrated support for finding required rows quickly. Storing data in
a columnar format lets the reader read, decompress, and process only
the values that are required for the current query.
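The benefit of a columnar layout can be shown with a toy in-memory example. This sketch only illustrates the row-versus-column idea in plain Python; it does not use or describe the ORC file format itself, and to_columns and total are hypothetical helpers.

```python
from typing import Any, Dict, List

# Row-oriented data: each record carries every field.
rows = [
    {"id": 1, "name": "a", "bytes": 10},
    {"id": 2, "name": "b", "bytes": 20},
    {"id": 3, "name": "c", "bytes": 30},
]


def to_columns(records: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pivot row records into one list of values per column."""
    columns: Dict[str, List[Any]] = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def total(columns: Dict[str, List[Any]], name: str) -> int:
    """Aggregate a single column without touching any other column."""
    return sum(columns[name])


columns = to_columns(rows)
bytes_total = total(columns, "bytes")
```

A query that only needs the "bytes" column reads one list and ignores "id" and "name" entirely, which is the property the description above attributes to columnar storage.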
The Apache PDFBox library is an open source Java tool for working with PDF documents.
APIs for manipulating various file formats based upon Open Office XML (ECMA-376) and Microsoft's OLE 2 Compound Document formats using pure Java. Apache POI is your Java Excel, Word and PowerPoint solution. We have a complete API for porting other OOXML and OLE 2 Compound Document formats and welcome others to participate.
Apache Parquet is a general-purpose columnar storage format, built for Hadoop, usable with any choice of data processing framework, data model, or programming language.
Apache Phoenix is a SQL query engine for accessing NoSQL datastores such as Apache HBase. It is accessed as a JDBC driver and enables querying, updating, and managing NoSQL tables through standard SQL. Instead of using map-reduce, Apache Phoenix compiles your SQL query into a series of HBase scans and orchestrates the running of those scans to produce regular JDBC result sets. Direct use of the HBase API, along with coprocessors and custom filters, results in performance on the order of milliseconds for small queries, or seconds for tens of millions of rows.
Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets.
Pig's infrastructure layer consists of a compiler that produces sequences of Map-Reduce programs. Pig's language layer consists of a textual language called Pig Latin, which has the following key properties:
* Ease of programming. It is trivial to achieve parallel execution of simple, "embarrassingly parallel" data analysis tasks. Complex tasks comprised of multiple interrelated data transformations are explicitly encoded as data flow sequences, making them easy to write, understand, and maintain.
* Optimization opportunities. The way in which tasks are encoded permits the system to optimize their execution automatically, allowing the user to focus on semantics rather than efficiency.
* Extensibility. Users can create their own functions to do special-purpose processing.
Apache Pivot is an open-source platform for building installable Internet applications (IIAs).
It combines the enhanced productivity and usability features of a modern user interface toolkit with the robustness of the Java platform.
Apache Portable Runtime
The mission of the Apache Portable Runtime (APR) project is to create
and maintain software libraries that provide a predictable and
consistent interface to underlying platform-specific implementations.
The primary goal is to provide an API to which software developers may
code and be assured of predictable if not identical behaviour
regardless of the platform on which their software is built, relieving
them of the need to code special-case conditions to work around or
take advantage of platform-specific deficiencies or features.
The Apache Portals project provides various software products, including Apache Jetspeed-2, Apache Pluto, and Apache Portals Applications.
Apache Props Ant Library
The Apache Props Antlib is a library of supplementary handlers for Apache Ant properties resolution.
The types provided are instances of org.apache.tools.ant.PropertyHelper.Delegate and can be invoked using the <propertyhelper> task provided in Ant 1.8.0.
Apache Qpid implements the latest AMQP specification, the first open standard for enterprise messaging, and provides transaction management, queuing, distribution, security, management, clustering, federation and heterogeneous multi-platform support and a lot more.
Apache Rat improves accuracy and efficiency when reviewing and auditing releases.
It is heuristic in nature: making guesses about possible problems.
It will produce false positives and cannot find every possible issue with a release.
Its reports require interpretation.
Apache Rat was developed in response to a need felt in the Apache Incubator to be able to
review releases for the most common faults less labour intensively. It is therefore highly tuned
to the Apache style of releases.
Apache Rave is a new web and social mashup engine. It will provide an out-of-the-box as well as an extensible lightweight Java platform to host, serve and aggregate (Open)Social Gadgets and services through a highly customizable and Web 2.0 friendly front-end. Rave is targeted as an engine for internet and intranet portals and as a building block to provide context-aware personalization and collaboration features for multi-site/multi-channel (mobile) oriented and content driven websites and (social) network oriented services and platforms. For the OpenSocial container and services the (Java) Apache Shindig will be integrated. At a later stage further generalization is envisioned to also transparently support W3C Widgets using Apache Wookie.
100% Pure Java Regular Expression package
Apache Reusable Dialog Components (RDC) Taglib
Server-side generation of HTML has proven an effective way of generating
the user interface for visual web applications. Over time, the effort
involved in such HTML generation has been reduced by the availability of
various JSP tag libraries that abstract away the minutiae of HTML markup.
The RDC project aims to achieve for voice and multimodal applications
what JSP tag libraries have already achieved in the world of visual web
Apache River software provides a JINI service, which is a service oriented architecture that defines a programming model which both exploits and extends Java technology to enable the construction of secure, distributed systems consisting of federations of services and clients. Jini technology can be used to build adaptive network systems that are scalable, evolvable and flexible as typically required in dynamic computing environments.
Apache Rivet is a system for creating dynamic web content via the Tcl programming language integrated with Apache Web Server. It is designed to be fast, powerful and extensible, consume few system resources, be easy to learn, and to provide the user with a platform that can also be used for other programming tasks outside the web (GUI's, system administration tasks, text processing, database manipulation, XML, and so on). To meet these goals, the Tcl programming language was chosen to combine with the Apache HTTP Server.
Apache Roller is a full-featured, multi-user and group-blog server suitable for blog sites large and small. It runs as a Java web application that should be able to run on most any Java EE server and relational database. Roller's installation guide covers deployment on Tomcat, GlassFish, and JBoss application servers using a MySQL, Derby, or PostgreSQL database. Users however have reported success running Roller on other app servers and databases.
- Multi-user blogging: can support tens of thousands of users and blogs
- Group blogging with three permission levels (editor, author and limited)
- Support for comment moderation and comment spam prevention measures
- Bloggers have complete control over blog layout/style via Apache Velocity-driven templates
- Built-in search engine indexes weblog entry content
- Pluggable cache and rendering system
- Support for blog clients that support MetaWeblog API
- All blogs have entry and comment feeds in both RSS 2.0 and Atom 1.0 formats
Apache SSHD is a 100% pure Java library to support the SSH protocols on both the client and server side. This library is based on Apache MINA, a scalable and high performance asynchronous IO library. SSHD does not really aim at being a replacement for the SSH client or SSH server from Unix operating systems, but rather provides support for Java based applications requiring SSH support.
Apache Samza provides a system for processing stream data from publish-subscribe systems such as Apache Kafka. The developer writes a stream processing task, and executes it as a Samza job. Samza then routes messages between stream processing tasks and the publish-subscribe systems that the messages are addressed to.
Apache Sandesha2 is an Axis2 module that implements the WS-ReliableMessaging specification. It can be used both on the client side and on the server side.
Library implementing XML Digital Signature Specification & XML Encryption Specification
Apache Scout is an implementation of JSR 93 (JAXR). It provides a standard way to access UDDI registries (particularly Apache jUDDI).
Apache ServiceMix is a flexible, open-source integration container that unifies the features and functionality of Apache ActiveMQ, Camel, CXF, and Karaf into a powerful runtime platform you can use to build your own integrations solutions. It provides a complete, enterprise ready ESB exclusively powered by OSGi.
Shale is a modern web application framework, fundamentally based on JavaServer Faces, and focused on improving ease of use for developers adopting JSF as a foundational technology in their own development environments.
Apache Shindig is a container for hosting social applications consisting of four parts:
OpenSocial Data Server: an implementation of the server interface to container-specific information, including the OpenSocial REST APIs, with clear extension points so others can connect it to their own backends.
Apache Shindig is the reference implementation of OpenSocial API specifications, versions 0.8.x and 0.9.x, a standard set of Social Network APIs.
Apache Shiro is a powerful and easy-to-use Java security framework that performs authentication,
authorization, cryptography, and session management. With Shiro’s easy-to-understand API, you can quickly
and easily secure any JVM-based application – from the smallest mobile applications to the largest web
and enterprise applications.
Apache Sling is a web framework that uses a Java Content Repository, such as Apache Jackrabbit,
to store and manage content.
Sling applications use either scripts or Java servlets, selected based on simple name conventions,
to process HTTP requests in a RESTful way.
The embedded Apache Felix OSGi framework and console provide a dynamic runtime environment, where
code and content bundles can be loaded, unloaded and reconfigured at runtime.
As the first web framework dedicated to JSR-170 Java Content Repositories, Sling makes it very
simple to implement simple applications, while providing an enterprise-level framework for more complex applications.
Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON, Ruby, and Python APIs, hit highlighting, faceted search, caching, replication, and a web administration interface.
Apache SpamAssassin is an extensible email filter which is used to identify spam. Using its rule base, it uses a wide range of advanced heuristic and statistical analysis tests on mail headers and body text to identify "spam", also known as unsolicited bulk email. Once identified, the mail can then be optionally tagged as spam for later filtering. It provides a command line tool to perform filtering, a client-server system to filter large volumes of mail, and Mail::SpamAssassin, a set of Perl modules.
Apache Spark is a fast and general engine for large-scale data processing. It offers high-level APIs in Java, Scala and Python as well as a rich set of libraries including stream processing, machine learning, and graph analytics.
Apache Spatial Information System
Apache SIS provides data structures for geographic data and associated metadata
along with methods to manipulate those data structures. The library is an implementation of GeoAPI interfaces
and can be used for desktop or server applications.
Apache Sqoop(TM) is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.
Apache Stanbol is a modular software stack and reusable set of components for semantic content management.
Apache STeVe is a collection of online voting tools, used by the ASF, to handle STV and other voting methods.
Apache Storm is a distributed real-time computation system. Similar to how Hadoop provides a set of general primitives for doing batch processing, Storm provides a set of general primitives for doing real-time computation.
Apache Stratos is a highly-extensible Platform-as-a-Service (PaaS) framework that helps run Apache Tomcat, PHP, and MySQL applications and can be extended to support many more environments on all major cloud infrastructures. For developers, Stratos provides a cloud-based environment for developing, testing, and running scalable applications. IT providers benefit from high utilization rates, automated resource management, and platform-wide insight including monitoring and billing.
The Apache Struts Project offers the Apache Struts 2 web framework, a comprehensive and modular tooling stack for creating web-based Java applications. Struts 2, which emerged from the WebWork 2 framework, is an excellent choice for teams who value elegant solutions to difficult problems.
Subversion exists to be universally recognized and adopted as an open-source, centralized version control system characterized by its reliability as a safe haven for valuable data; the simplicity of its model and usage; and its ability to support the needs of a wide variety of users and projects, from individuals to large-scale enterprise operations.
Apache Synapse is a simple and highly effective ESB, Web Services intermediary and SOA framework. It can be
added to your existing network very simply either as a services gateway or as an HTTP proxy. Once Apache
Synapse is mediating your service requests it can perform many functions including routing, load-balancing,
transformation and protocol switching. Apache Synapse can be used to build an Enterprise Service Bus (ESB) or
Service Oriented Architecture (SOA).
Apache Synapse has been designed to support very fast XML routing with a streaming XML design based upon
Apache Axiom. In addition, the use of a completely asynchronous architecture and non-blocking IO based on Java NIO
ensures that Synapse has very low overhead and can scale to support thousands of concurrent clients without dropping
Apache Syncope is an Open Source system for managing digital identities in enterprise environments, implemented in JEE technology and released under Apache 2.0 license.
Identity management (or IdM) represents the joint result of business process and IT to manage user data on systems and applications. IdM involves considering user attributes, roles, resources and entitlements in trying to give a decent answer to the question that is constantly on IT administrators' minds:
Who has access to What, When, How, and Why?
The main goal of Apache Tajo project is to build an advanced open
source data warehouse system in Hadoop for processing web-scale data
sets. Tajo provides standard SQL as its query language.
Tajo is designed for both interactive and batch queries on data sets
stored on HDFS and other data sources. Without hurting query response
times, Tajo provides fault-tolerance and dynamic load balancing which
are necessary for long-running queries. Tajo employs cost-based and
progressive query optimization techniques for reoptimizing running
queries in order to avoid the worst query plans.
Tapestry is a component-oriented Java web application framework.
Its design emphasizes ease of use and developer productivity. Component
classes are simple POJOs, with Tapestry using byte code manipulation to
enhance classes at runtime. Configuration is via annotations and naming
conventions rather than XML. Web page and component templates use regular
(X)HTML that can be edited by any web designer. Live Class Reloading enables
you to edit Java code and immediately see results by reloading the page in
the web browser, resulting in a very fast "code it - see it - fix it" loop.
Taverna includes the Workbench (desktop client application), the Command Line Tool (for quick execution of workflows from a terminal), the Server (for remote execution of workflows) and the Player (Web interface plugin for submitting workflows for remote execution). The Taverna Platform gives OSGi-based programmatic access to the Taverna workflow engine.
Apache Tentacles helps the reviewer by automating interactions with the repository containing the artifacts comprising the release.
Texen is a general purpose text generating utility. It is capable of producing almost any sort of text output. Driven by Ant, essentially an Ant Task, Texen uses a control template, an optional set of worker templates, and control context to govern the generated output. Although TexenTask can be used directly, it is usually subclassed to initialize your control context before generating any output.
Apache Thrift allows you to define data types and service interfaces in a simple definition file. Taking that file as input, the compiler generates code to be used to easily build RPC clients and servers that communicate seamlessly across programming languages. Instead of writing a load of boilerplate code to serialize and transport your objects and invoke remote methods, you can get right down to business.
The Apache Tika toolkit is an ASFv2 licensed open source tool for extracting information
from digital documents. Tika allows search engines, content management systems and other
applications that work with various kinds of digital documents to easily detect and extract
metadata and content from all major file formats.
Apache Tiles™ is a templating framework built to simplify the
development of web application user interfaces.
Tiles allows authors to define page fragments which can be assembled into a
complete page at runtime. These fragments, or tiles, can be used as simple
includes in order to reduce the duplication of common page elements or embedded
within other tiles to develop a series of reusable templates. These templates
streamline the development of a consistent look and feel across an entire application.
The goal of Tobago is to provide the community with a well designed set of user interface components based on JSF.
Apache TomEE Web Profile delivers Servlets, JSP, JSF, JTA, JPA, CDI, Bean Validation and EJB
Lite. Apache TomEE Plus has all the features of TomEE with the addition of JAX-RS (RESTful Services),
JAX-WS (Web Services), JMS (Java Message Service) and JCA (the Java Connector Architecture). The
additional functionality is delivered via Apache CXF, Apache ActiveMQ and the Geronimo Connector library
Apache Tomcat is a web server that is an open source software
implementation of the Java Servlet and JavaServer Pages technologies.
The Java Servlet and JavaServer Pages specifications are developed under
the Java Community Process. Apache Tomcat is developed in an open and
participatory environment and released under the Apache License version 2.
Apache Tomcat is intended to be a collaboration of the best-of-breed
developers from around the world. We invite you to participate in this open
Apache Tomcat powers numerous large-scale, mission-critical web applications
across a diverse range of industries and organizations. Some of these users
and their stories are listed on the PoweredBy wiki page.
Torque is an object-relational mapper for Java. In other words, Torque lets you access and manipulate data in a relational database using Java objects. Unlike most other object-relational mappers, Torque does not use reflection to access user-provided classes; instead, it generates the necessary classes (including the Data Objects) from an XML schema describing the database layout (which can either be written by hand or generated from an existing database). The XML schema can also be used to generate and execute a SQL script which creates all the tables in the database.
Apache Traffic Server
Apache Traffic Server is a fast, scalable and extensible HTTP/1.1-compliant caching proxy server. ATS can be used as a reverse, forward or even transparent HTTP proxy.
Turbine is a servlet-based framework that allows experienced Java developers to quickly build web applications. Turbine allows you to personalize web sites and to use user logins to restrict access to parts of your application.
Turbine is a mature and well-established framework that is used as the base of many other projects (e.g. the excellent Jetspeed 1 Portals framework).
Turbine is an excellent choice for developing applications that make use of a services-oriented architecture. Some of the functionality provided with Turbine includes a security management system, a scheduling service, XML-defined form validation server, and an XML-RPC service for web services. It is a simple task to create new services particular to your application.
The Turbine core is free of any dependency on a presentation layer technology. Both JavaServer Pages (JSP) and Velocity are supported inside Turbine. For developers who are already familiar with JSP, or who have existing JSP tag libraries, Turbine offers support for the Sun standard. Velocity is the favorite view technology of most users of the Turbine framework; try it out and see if Velocity can help you develop your web applications faster and work more easily with non-programming designers.
Turbine is developed in an open, participatory environment and released under the Apache Software License. Turbine is intended to be a collaboration of the best-of-breed developers from around the world. We invite you to participate in this open development project. To learn more about getting involved, look at our "How to Help" pages.
Apache Tuscany simplifies the task of developing SOA solutions by providing a comprehensive infrastructure for SOA development and management that is based on the Service Component Architecture (SCA) standard. With SCA as its foundation, Tuscany offers solution developers the following advantages:
Provides a model for creating composite applications by defining the services in the fabric and their relationships with one another. The services can be implemented in any technology.
Enables service developers to create reusable services that only contain business logic. Protocols are pushed out of business logic and are handled through pluggable bindings. This lowers development cost.
Applications can easily adapt to infrastructure changes without recoding since protocols are handled via pluggable bindings and quality of services (transaction, security) are handled declaratively.
Existing applications can work with new SCA compositions. This allows for incremental growth towards a more flexible architecture, outsourcing or providing services to others.
The Apache UIMA project supports
the community working on the analysis of unstructured information
with a unifying Java and C++ framework, tooling,
and analysis components, guided by the OASIS UIMA standard.
It includes support for very large scaleout using networked
clusters of compute nodes.
VCL is a modular cloud computing platform which dynamically provisions and brokers remote access to compute resources including virtual machines, bare-metal computers, and resources in other cloud platforms. A self-service web portal is used to request resources and for administration.
Apache VSS Ant Library
The Apache VSS Antlib provides an interface to the Microsoft Visual SourceSafe SCM. The original Ant tasks have been expanded upon in this Antlib. Some fixes to issues in the original tasks have also been incorporated.
Apache VXQuery will be a standards-compliant XML Query processor implemented in Java. The focus is on the evaluation of queries on large amounts of XML data. Specifically, the goal is to evaluate queries on large collections of relatively small XML documents. To achieve this, queries will be evaluated on a cluster of shared-nothing machines.
Velocity is a Java-based template engine. It permits anyone to use a simple yet powerful template language to reference objects defined in Java code.
When Velocity is used for web development, Web designers can work in parallel with Java programmers to develop web sites according to the Model-View-Controller (MVC) model, meaning that web page designers can focus solely on creating a site that looks good, and programmers can focus solely on writing top-notch code. Velocity separates Java code from the web pages, making the web site more maintainable over its lifespan and providing a viable alternative to Java Server Pages (JSPs) or PHP.
Apache Velocity DVSL
DVSL (Declarative Velocity Style Language) is a tool modeled after XSLT and is intended for general XML transformations using the Velocity Template Language as the templating language for the transformations. The key differences are that it incorporates easy access to Java objects and allows you to use the Velocity template language and its features for expressing the transformation templates.
Apache Velocity Tools
VelocityTools is a collection of Velocity subprojects with a common goal of creating tools and infrastructure for building both web and non-web applications using the Velocity template engine.
Apache Vysper aims to be a modular, full featured XMPP (Jabber) server. Vysper is implemented in Java.
Websh is a rapid development environment for building powerful, fast, and reliable web applications in Tcl. Websh is versatile and handles everything from HTML generation to database-driven one-to-one page customization. Websh can be run in CGI environments and as an Apache module.
Apache Whirr is a set of libraries for running cloud services. It provides:
1. A cloud-neutral way to run services. You don't have to worry about the idiosyncrasies of each provider.
2. A common service API. The details of provisioning are particular to the service.
3. Smart defaults for services. You can get a properly configured system running quickly, while still being able to override settings as needed.
You can also use Whirr as a command line tool for deploying clusters.
Apache Whisker allows an application to model the licensing characteristics of the contents of its distributions. Use cases are auditing the model against the contents of a distribution, reporting on the contents of a distribution and generating licensing documents (LICENSE, NOTICE and so on) for a distribution. Whisker distributes tooling for the command line and for build systems such as Maven.
With proper mark-up/logic separation, a POJO data model, and a refreshing lack of XML, Apache Wicket makes developing web-apps simple and enjoyable again. Swap the boilerplate, complex debugging and brittle code for powerful, reusable components written with plain Java and HTML.
Apache Wink is a simple yet solid framework for building RESTful Web services. It is comprised of a Server module and a Client module for developing and consuming RESTful Web services.
The Woden project is a subproject of the Apache Web Services Project to develop a Java class library for reading, manipulating, creating and writing WSDL documents, initially to support WSDL 2.0 but with the longer term aim of supporting past, present and future versions of WSDL.
There are two main deliverables: an API and an implementation. The Woden API will consist of a set of Java interfaces. The WSDL 2.0-specific portion of the Woden API will conform to the W3C WSDL 2.0 specification. The implementation will be a high performance implementation directly usable in other Apache projects such as Axis2.
Apache Wookie is a Java server application that allows you to upload and deploy widgets for your applications; widgets can not only include all the usual kinds of mini-applications, badges, and gadgets, but also fully-collaborative applications such as chats, quizzes, and games.
Apache XML Commons External
The External components portion of Apache XML Commons contains interfaces that are defined by external standards organizations. For DOM, that's the W3C; for SAX it's David Megginson (http://www.saxproject.org); for JAXP it's Sun. While we could send users to each of the primary sources for these deliverables, keeping our own versions of these in the XML Commons repository gives us a number of advantages: 1) Simplicity of downloads; users get the whole product from one place, 2) Better version control; we can only take fixes we want and add Apache-specific changes, 3) Better overview documentation of how these interfaces fit into the XML processing world, 4) More chance for cross-project community building within Apache projects.
Apache XML Commons Resolver
The XML Commons Resolver can be used in a wide variety of XML parsing, processing and related programs to resolve various public or system identifiers into accessible URLs for use by your application. The resolver supports several catalog types for mapping, including OASIS XML, OASIS TR 9401 and XCatalog styles.
Apache XML Graphics Commons
Apache XML Graphics Commons is a library that consists of several reusable components used by Apache Batik and Apache FOP. Many of these components can easily be used separately outside the domains of SVG and XSL-FO. You will find components such as a PDF library, an RTF library, Graphics2D implementations that let you generate PDF and PostScript files and much more.
XMLBeans is a tool that allows you to access the full power of XML in a Java-friendly way. The idea is that you can take advantage of the richness and features of XML and XML Schema and have these features mapped as naturally as possible to the equivalent Java language and typing constructs. XMLBeans uses XML Schema to compile Java interfaces and classes that you can then use to access and modify XML instance data. Using XMLBeans is similar to using any other Java interface/class: you will see things like getFoo or setFoo, just as you would expect when working with Java. While a major use of XMLBeans is to access your XML instance data with strongly typed Java classes, there are also APIs that give you access to the full XML infoset (XMLBeans keeps XML Infoset fidelity) and let you reflect into the XML schema itself through an XML Schema Object model.
For more details on XMLBeans see the XMLBeans Wiki pages or the XMLBeans documentation (the Documentation tab on this website).
What Makes XMLBeans Different
There are at least two major things that make XMLBeans unique from other XML-Java binding options.
1. Full XML Schema support. XMLBeans fully supports XML Schema and the corresponding Java classes provide constructs for all of the major functionality of XML Schema. This is critical since you often do not have control over the features of XML Schema that you need to work with in Java. Also, XML Schema oriented applications can take full advantage of the power of XML Schema and not have to restrict themselves to a subset.
2. Full XML Infoset fidelity. When unmarshalling an XML instance the full XML infoset is kept and is available to the developer. This is critical because of the subset of XML that is not easily represented in Java. For example, the order of the elements or comments might be needed in a particular application.
A major objective of XMLBeans has been to be applicable in all non-streaming (in memory) XML programming situations. You should be able to compile your XML Schema into a set of Java classes and know that 1) you will be able to use XMLBeans for all of the schemas you encounter (even the warped ones) and 2) you will be able to get to the XML at whatever level is necessary, without having to resort to multiple tools to do this.
To accomplish this XMLBeans provides three major APIs:
* XmlObject The Java classes that are generated from an XML Schema are all derived from XmlObject. These provide strongly typed getters and setters for each of the elements within the defined XML. Complex types are in turn XmlObjects. For example getCustomer might return a CustomerType (which is an XmlObject). Simple types turn into simple getters and setters with the correct Java type. For example getName might return a String.
* XmlCursor From any XmlObject you can get an XmlCursor. This provides efficient, low level access to the XML Infoset. A cursor represents a position in the XML instance. You can move the cursor around the XML instance at any level of granularity you need from individual characters to Tokens.
* SchemaType XMLBeans provides a full XML Schema object model that you can use to reflect on the underlying schema meta information. For example, you might want to generate a sample XML instance for an XML schema or perhaps find the enumerations for an element so that you can display them.
All of this was built with performance in mind. Informal benchmarks and user feedback indicate that XMLBeans is extremely fast.
Apache Xalan for C++ XSLT Processor
Xalan-C++ is an XSLT processor for transforming XML documents into HTML,
text, or other XML document types. It implements XSL Transformations (XSLT)
Version 1.0 and XML Path Language (XPath) Version 1.0 and can be used from the
Apache Xalan for Java XSLT Processor
Xalan-J is an XSLT processor for transforming XML documents into HTML, text, or other XML document
types. It implements XSL Transformations (XSLT) Version 1.0 and XML Path Language (XPath) Version 1.0
and can be used from the command line, in an applet or a servlet, or as a module in another program.
Xalan-J implements the javax.xml.transform interface in Java API for XML Processing (JAXP) 1.3. This
interface provides a modular framework and a standard API for performing XML transformations, and
utilizes system properties to determine which Transformer and which XML parser to use.
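As a minimal sketch of the javax.xml.transform interface described above, the following applies an XSLT 1.0 stylesheet to an in-memory document. The class name, the stylesheet and the sample input are hypothetical and chosen only for illustration. TransformerFactory.newInstance() returns whichever JAXP implementation is configured; on a plain JDK with no Xalan-J jar on the classpath, that is the JDK's built-in processor rather than Xalan-J itself.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TransformSketch {
    // Hypothetical stylesheet: writes the text of every <name> element, one per line.
    public static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:for-each select='//name'>"
      + "<xsl:value-of select='.'/><xsl:text>&#10;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the given stylesheet to the given XML and returns the result as a string.
    public static String transform(String xml, String xslt) throws TransformerException {
        // The factory lookup selects the configured JAXP transformer implementation.
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Hypothetical input document.
        String xml = "<projects><name>Xalan</name><name>Xerces</name></projects>";
        System.out.print(transform(xml, XSLT));
    }
}
```

Because the code depends only on the JAXP API, switching processors is a matter of configuration (for example the javax.xml.transform.TransformerFactory system property) rather than a code change.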
Apache Xerces for C++ XML Parser
Xerces-C++ is a validating XML parser written in a portable subset of C++. Xerces-C++ makes it easy to give your application the ability to read and write XML data. A shared library is provided for parsing, generating, manipulating, and validating XML documents.
Xerces-C++ is faithful to the XML 1.0 and 1.1 recommendations and many associated standards.
The parser provides high performance, modularity, and scalability. Source code, samples and API documentation are provided with the parser. For portability, care has been taken to make minimal use of templates, no RTTI, and minimal use of #ifdefs.
Apache Xerces for Java XML Parser
Xerces-J is a high performance, fully compliant validating XML parser written in Java. It is a fully conforming XML Schema processor that includes a complete implementation of the Document Object Model Level 3 Core and Load/Save W3C Recommendations and provides a complete implementation of the XML Inclusions (XInclude) W3C Recommendation. It also provides support for OASIS XML Catalogs v1.1.
Xerces 2.x introduced the Xerces Native Interface (XNI), a complete framework for building parser components and configurations that is extremely modular and easy to program. XNI is merely an internal set of interfaces. There is no need for an XML application programmer to learn XNI if they only intend to interface to the Xerces2 parser using standard interfaces like JAXP, DOM, and SAX. Xerces developers and application developers that need more power and flexibility than that provided by the standard interfaces should read and understand XNI.
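To illustrate the point that application code can stay on the standard interfaces, here is a minimal sketch that parses a document through JAXP and reads it through DOM, without touching XNI. The class name and the sample document are hypothetical. DocumentBuilderFactory.newInstance() returns whichever JAXP parser is configured, so this code is not guaranteed to be running Xerces2 specifically.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomParseSketch {
    // Parses an XML string into a DOM Document using only the standard JAXP API.
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input document.
        Document doc = parse(
            "<parsers>"
          + "<parser lang='Java'>Xerces-J</parser>"
          + "<parser lang='C++'>Xerces-C++</parser>"
          + "</parsers>");
        NodeList parsers = doc.getElementsByTagName("parser");
        for (int i = 0; i < parsers.getLength(); i++) {
            Element parser = (Element) parsers.item(i);
            System.out.println(parser.getAttribute("lang") + ": " + parser.getTextContent());
        }
    }
}
```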
The latest version released, 2.11.0, expands on Xerces' experimental support for XML Schema 1.1 by providing implementations for the simplified complex type restriction rules (also known as subsumption), xs:override and a few other XML Schema 1.1 features. This release also introduces experimental support for XML Schema Component Designators (SCD). It fixes several bugs which were present in Xerces-J 2.10.0 and also includes a few other minor enhancements.
Apache Xerces for Perl XML Parser
XML::Xerces is the Perl API to the Apache project's Xerces XML parser. It is implemented using the Xerces C++ API, and it provides access to most of the C++ API from Perl.
Because it is based on Xerces-C, XML::Xerces provides a validating XML parser that makes it easy to give your application the ability to read and write XML data. Classes are provided for parsing, generating, manipulating, and validating XML documents. XML::Xerces is faithful to the XML 1.0 and 1.1 recommendations and associated standards (DOM levels 1, 2, and 3, SAX 1 and 2, Namespaces, and W3C XML Schema). The parser provides high performance, modularity, and scalability, and provides full support for Unicode.
XML::Xerces implements the vast majority of the Xerces-C API (if you notice any discrepancies please mail the list). The exception is some functions in the C++ API which either have better Perl counterparts (such as file I/O) or which manipulate internal C++ information that has no role in the Perl module.
The majority of the API is created automatically using the Simplified Wrapper Interface Generator (SWIG). However, care has been taken to make most method invocations natural to Perl programmers, so a number of rough C++ edges have been smoothed over (see the Special Perl API Features section).
Pure Java based native XML database. Supports XPath and XUpdate.
Apache Zest is a project that explores the Composite Oriented Programming paradigm, where Fragments get composed into Composites, which are placed into Modules, placed inside Layers, to enforce Application Structure. Classes are Dead, Long Live Interfaces.
Apache ZooKeeper is an effort to develop and maintain an open-source server which enables highly reliable distributed coordination.