Lucene
Developed by Apache Software Foundation
Latest release 2.3.1 / February 23, 2008
Written in Java
OS Cross-platform
Genre Search and index
License Apache License 2.0
Website lucene.apache.org

Lucene is a free/open source information retrieval library, originally implemented in Java. It is supported by the Apache Software Foundation and is released under the Apache Software License. Lucene has been ported to programming languages including Delphi, Perl, C#, C++, Python, Ruby and PHP.


While suitable for any application that requires full-text indexing and searching capability, Lucene has been widely recognized for its utility in the implementation of Internet search engines and local, single-site searching. Lucene itself is just an indexing and search library and does not contain crawling and HTML parsing functionality. The Apache project Nutch is based on Lucene and provides this functionality; the Apache project Solr is a fully-featured search server based on Lucene.
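Because crawling and HTML parsing sit outside the library, an application has to turn markup into plain text before handing it to an indexer. The following is a minimal sketch of that extraction step using only Python's standard library; it is not part of Lucene or Nutch, and the names TextExtractor and html_to_text are hypothetical, chosen for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text from HTML, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside <script> or <style>
        self.chunks = []       # text fragments in document order

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML string as one space-joined line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

The resulting string is what an indexing library would receive; fetching the pages in the first place is a separate crawling concern.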


At the core of Lucene's logical architecture is the idea of a document containing fields of text. This flexibility allows Lucene's API to be independent of file format. Text from PDF, HTML, Microsoft Word and many other document formats can be indexed as long as the textual information can be extracted.
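The idea of a document as a set of named text fields can be illustrated with a toy inverted index. This is a sketch of the general technique only, written in Python; it does not use or reproduce Lucene's API, and ToyIndex, tokenize, add and search are hypothetical names. It assumes the text has already been extracted from its original file format.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ToyIndex:
    """Maps (field, term) pairs to the ids of documents containing them."""

    def __init__(self):
        self.postings = defaultdict(set)  # (field, term) -> set of doc ids
        self.docs = []                    # stored documents, by id

    def add(self, fields):
        """Index a document given as a dict of field name -> text."""
        doc_id = len(self.docs)
        self.docs.append(fields)
        for field, text in fields.items():
            for term in tokenize(text):
                self.postings[(field, term)].add(doc_id)
        return doc_id

    def search(self, field, query):
        """Return sorted ids of documents whose field contains every query term."""
        terms = tokenize(query)
        if not terms:
            return []
        result = set(self.postings.get((field, terms[0]), set()))
        for term in terms[1:]:
            result &= self.postings.get((field, term), set())
        return sorted(result)


index = ToyIndex()
index.add({"title": "Lucene overview", "body": "Text extracted from a PDF file"})
index.add({"title": "Search servers", "body": "Text extracted from an HTML page about Lucene"})
print(index.search("body", "extracted text"))  # [0, 1]
print(index.search("title", "lucene"))         # [0]
```

The index never sees the source format: a PDF and an HTML page are both reduced to the same field-to-text mapping before they are added.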


Software using Lucene

  • OpenGrok A fast and usable source code search and cross reference engine
  • Puggle A graphical desktop search engine that uses Lucene for full-text and metadata search [1]
  • isoHunt Uses Lucene for the site search [2]
  • Gplex Database uses a C# version of Lucene to index database metadata.
  • EB-eye, EMBL-EBI's search engine for biomedical databases (contains more than 200 million documents)
  • Joost Internet TV uses Lucene to search for programs.
  • MediaWiki can use Lucene for full-text search.
  • QuestAgent, an offline search engine, uses Lucene in an applet that provides full-text search.
  • Liferay open source portal, uses Lucene for full-text search.
  • Beagle uses a port of Lucene to C# called Lucene.Net as its indexer.
  • Daisy uses Lucene for site search.
  • Merobase Component finder creates its index with Lucene
  • db4o works in combination with Lucene to support full text search.
  • Digg [3]
  • Docco uses Lucene for desktop search.
  • DSpace uses Lucene.
  • CNET uses Lucene to search their product category listings.
  • LjFind uses Lucene to search over 110,000,000 LiveJournal posts.
  • Red-Piranha [4] is another Lucene based search engine. It is ready to use, deployable as a GUI, command line or Tomcat web application, and has the ability to "learn" what the user wants.
  • The Flock web browser uses Clucene, a C++ version, to do a full text search of browser history.
  • KnowledgeBase [5] - A service focused CRM platform which uses the Lucene search engine
  • Zimbra groupware incorporates Lucene.
  • ANts P2P, an anonymous file sharing program, uses Lucene for its search option.
  • MMBase has an expansion that uses Lucene for indexing its data.
  • Alfresco,[6] a free/open source Enterprise Content Management system
  • Strigi [7] uses CLucene, a C++ version, to index and search the desktop.
  • Midgard uses Lucene for its indexing and full-text search
  • Nuxeo EP,[8] a free/open source Enterprise Content Management (ECM) platform
  • Local Lucene,[9] a Geographical based searching solution using Lucene
  • Perst, an open source, object-oriented embedded database, integrates with Lucene for full-text database indexing and searching and for ACID-compliant transactional protection of the Lucene index
  • judy's book [10] uses Solr Lucene.
  • MindTouch Deki Wiki,[11] a free open source wiki and application platform, employs dotLucene for indexing wiki pages and file attachments.
  • LoopTeK Search Internet Video content search.
  • Scalix is using Lucene for their Search and Indexing Service (SIS), available in version 11 of Scalix.
  • panFMP [12] is a generic and flexible framework for building metadata portals independent of metadata formats and protocols. As panFMP was developed specifically for Spatial Data Infrastructures, Lucene was extended by performant trie-based range-queries.
  • Jira [13] is a popular issue tracking system.
  • VYRE Unify [14] Content management platform
  • GeoNetwork opensource [15] is a standards based, Free and Open Source catalog application to manage spatially referenced resources using Lucene.
  • Tesco Healthy Living Tracker [16] uses a port of Lucene to C# called Lucene.Net for its search functionality.
  • OpenCms, an open source content management system

A more extensive list of software that uses Lucene is in the PoweredBy page of Lucene's wiki.


Ports

Lucene has been ported or is in the process of being ported to various programming languages other than Java.


References

See also

  • Hadoop, a collection of free Java software previously developed by the Nutch project
  • Nutch, an effort to build an open source search engine
  • Solr, an open source enterprise search server based on the Lucene Java search library


The Wikipedia article included on this page is licensed under the GFDL.