EPISODE · Apr 19, 2011 · 34 MIN

Episode 117: Full Text Search

from Faceoff Show · host Faceoff Staff

Add enterprise-level search to your site.

News and Follow/Ups - 01:00
- Square now being sold in Apple’s store
- Check-ins dying out?
- Dropbox: 25 million users

Geek Tools - 14:13
- Yikerz! - Super fun magnet game

Webapps - 16:12
- Surfboard - Flipboard as a web app
- InstaLyrics - Find lyrics quickly

Full Text Search - 22:11

Options

Google Custom Search
- Commercial
- Benefits
  - Super fast to set up
  - Easy to implement
  - Ability to add AdSense to search results
- Downsides
  - Unable to adjust content ranking or do custom integration
  - Mainly for indexing HTML pages, not search queries and other text

Sphinx
- “Searching via SphinxAPI is as simple as 3 lines of code, and querying via SphinxQL is even simpler, with search queries expressed in good old SQL.”
- Open source with commercial support
- Results are ranked by relevance by default. You can set up your own sorting should you wish, and give specific fields higher weightings.
- The search service daemon (searchd) is pretty low on memory usage, and you can also set limits on how much memory the indexer process uses.
- APIs for Java, PHP, Python, Ruby, Perl, C, and other languages
- Written in C++
- Stats
  - 60+ MB/sec per server
  - 500+ queries/sec
  - The biggest known Sphinx cluster indexes 5 billion documents, resulting in over 6 TB of data.
  - The busiest known one is, unsurprisingly, Craigslist, which serves 50+ million search queries/day.
- Companies using Sphinx
  - Craigslist
  - Slashdot
  - Mozilla
  - WordPress.org

Lucene
- Developed by the Apache Software Foundation
- Open source
- Written in Java
- Search types
  - Ranked searching: best results returned first
  - Many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more
  - Fielded searching (e.g., title, author, contents)
  - Date-range searching
  - Sorting by any field
  - Multiple-index searching with merged results
  - Allows simultaneous update and searching
- Stats
  - Over 95 GB/hour on modern hardware
  - Small RAM requirements: only 1 MB heap
  - Index size roughly 20-30% of the size of the text indexed

Solr
- Lucene is a library, whereas Solr is a server that supports XML and REST
- Benefits over Sphinx
  - Solr is easily embeddable in Java applications.
  - Solr can be integrated with Hadoop to build distributed applications.
  - Solr can index proprietary formats like Microsoft Word, PDF, etc. Sphinx can't.
- Companies using Solr
  - eHarmony
  - Ticketmaster
  - Digg
  - AOL
  - Zappos
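
The show notes mention relevance ranking, fielded searching, and giving specific fields higher weightings. As a rough illustration of that idea only, here is a toy inverted index in plain Python. It is not how Sphinx, Lucene, or Solr rank results; the class name `TinyIndex`, the `FIELD_WEIGHTS` values, and the sample documents are all hypothetical and chosen just for this sketch.

```python
from collections import defaultdict
import re

# Hypothetical weights: a hit in the title counts three times as much as one in the body.
FIELD_WEIGHTS = {"title": 3.0, "body": 1.0}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Toy inverted index: term -> {doc_id: weighted term count}."""

    def __init__(self):
        self.postings = defaultdict(lambda: defaultdict(float))

    def add(self, doc_id, fields):
        # fields maps a field name (e.g. "title") to that field's text.
        for name, text in fields.items():
            weight = FIELD_WEIGHTS.get(name, 1.0)
            for term in tokenize(text):
                self.postings[term][doc_id] += weight

    def search(self, query):
        # Sum the weighted counts of every query term; best score first,
        # ties broken by document id.
        scores = defaultdict(float)
        for term in tokenize(query):
            for doc_id, score in self.postings.get(term, {}).items():
                scores[doc_id] += score
        return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


index = TinyIndex()
index.add(1, {"title": "Full text search", "body": "Sphinx and Lucene compared"})
index.add(2, {"title": "Magnet games", "body": "Nothing about search engines, only search for fun"})
results = index.search("search")
```

Here document 1 ranks above document 2 even though document 2 contains the word twice, because the single title match carries more weight than two body matches. Real engines use far more refined scoring, but the shape (index once, then look up terms and combine per-field scores) is the same basic idea.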

