When dealing with large data sets, one strives to make accurate inferences
based on samples, often with the aim of making critical decisions based on
evidence, not just “educated guesses” or optimistic hindsight leading
to “we think we improved the situation.” How much did that new
taxonomy actually improve the search experience? How confident
are we that the new relevancy model led to increased click-through
for those top 100 SKUs? How do we approximate our bandwidth
utilization on a 10-gigabit link without collecting every single packet?
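As a rough illustration of the kind of question posed above, here is a minimal Python sketch of estimating a mean from a sample and attaching a standard error to it. The function names, the 1.96 multiplier for an approximate 95% interval, and the simulated packet sizes are assumptions made for this example, not figures from a real link.

```python
import math
import random
import statistics


def mean_with_standard_error(sample):
    """Return (sample mean, standard error of the mean) for a list of numbers."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.mean(sample)
    # Sample standard deviation (n - 1 in the denominator).
    std_dev = statistics.stdev(sample)
    return mean, std_dev / math.sqrt(n)


def confidence_interval(sample, z=1.96):
    """Approximate 95% interval for the mean, using the CLT normal approximation."""
    mean, se = mean_with_standard_error(sample)
    return mean - z * se, mean + z * se


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical population: packet sizes in bytes (simulated, not measured).
    population = [rng.randint(64, 1500) for _ in range(100_000)]
    # Inspect only 1,000 packets instead of all 100,000.
    sample = rng.sample(population, 1_000)
    low, high = confidence_interval(sample)
    print(f"estimated mean packet size: between {low:.1f} and {high:.1f} bytes")
```

The standard error shrinks with the square root of the sample size, so quadrupling the number of sampled packets only halves the width of the interval.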
Exploring Standard Error & CLT
Available Platforms
This is the second post in a three-post series about scaling enterprise search. Today’s post will focus on the indexing engines and platforms in common use today, and will give a light overview of hardware requirements.
The search engine landscape is populated with both open source and proprietary platforms. In the open source world, the most prominent and widely used platform is Solr, which is built on the search library Lucene. Both of these projects are from the Apache Software Foundation and are commercially supported by Lucid Imagination. Other platforms such as Sphinx provide special integration with databases such as MySQL. One of the most common functions of a search engine is to provide high-volume, high-performance searching of information that is stored long-term in a database.
From time to time the systems team here at Lightcrest builds custom packages for our clients that allow for easy, repeatable roll-outs of development or production environments. We keep a central version control repository (svn or git, depending on customer preference) of these packages in both their source and binary formats for easy administration and quality consistency.
This blog post will give a brief overview of how to build custom RPM packages that can then be installed onto multiple systems through your preferred package management and deployment utility. For the purpose of this document, we will be packaging the latest Ruby (1.9.2-p0) for CentOS 5. Since we don’t want to override the CentOS-provided Ruby packages, which could cause version conflicts and break the provided Ruby gems, we will be installing this into /opt.
Introduction to BridgeBot
Hey guys, I thought I’d drop my first post with something potentially useful for folks out there who love to write Python and happen to need protocol bridging for their chat systems. As you may or may not know, Lightcrest has a chat system in place that allows users to interact with our sales and engineering staff. Rather than purchase a third-party application, we decided to build it ourselves so we could extend it in the future (also – why buy something when you can build it in four hours?).
Our chat system runs off a custom Flex app that talks to a custom ratbox IRC daemon. When the Flex app loads, it talks to our custom IRC daemon over a TCP socket and initiates, registers, and funnels messages as any other IRC client would.
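To make the "initiates, registers, and funnels messages" flow a bit more concrete, here is a minimal Python sketch of what any IRC client does over a TCP socket. This is not the BridgeBot code or the Flex app's logic; the function names, host, nickname, and channel are hypothetical, and only the standard NICK, USER, PING/PONG, JOIN, and PRIVMSG commands are used.

```python
import socket


def registration_lines(nick, user, realname):
    """Build the NICK and USER commands an IRC client sends to register."""
    return [f"NICK {nick}", f"USER {user} 0 * :{realname}"]


def pong_reply(line):
    """Return the PONG answer for a server PING line, or None for anything else."""
    if line.startswith("PING"):
        return "PONG" + line[4:]
    return None


def privmsg(target, text):
    """Build a PRIVMSG command that sends text to a channel or nick."""
    return f"PRIVMSG {target} :{text}"


def send_line(sock, line):
    """IRC messages are terminated by CR LF."""
    sock.sendall((line + "\r\n").encode("utf-8"))


def run_client(host, port, nick, channel):
    """Connect, register, join a channel after the welcome, and answer PINGs."""
    with socket.create_connection((host, port)) as sock:
        for line in registration_lines(nick, nick, nick):
            send_line(sock, line)
        buffer = b""
        joined = False
        while True:
            data = sock.recv(4096)
            if not data:
                break  # server closed the connection
            buffer += data
            # Keep any trailing partial line in the buffer for the next read.
            *lines, buffer = buffer.split(b"\r\n")
            for raw in lines:
                line = raw.decode("utf-8", errors="replace")
                reply = pong_reply(line)
                if reply:
                    send_line(sock, reply)
                    continue
                parts = line.split(" ")
                # Numeric 001 is the server's welcome after successful registration.
                if len(parts) > 1 and parts[1] == "001" and not joined:
                    send_line(sock, f"JOIN {channel}")
                    joined = True
```

Calling something like `run_client("irc.example.org", 6667, "bridgebot", "#sales")` (all placeholder values) would register and sit in the channel; a bridge would add its own handling for incoming PRIVMSG lines and use `privmsg` to relay messages from the other protocol.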
This is the first “real” post in the Lightcrest company blog. My name is Michael Hughes, one of the principals here at Lightcrest. Our blog will hopefully shed a bit of light on the day-to-day operations of our company as well as dive into some technical aspects of what we do for our clients. Since information technology is such a large and diverse field to work in, a lot of what we do can seem somewhat opaque and mysterious to people on the street (as it were), so this is our attempt at clarifying for our friends, colleagues, customers, and the rest of the world what we do on a day-to-day basis. Zach Fierstadt, another Lightcrest principal, will also be contributing to this blog, as well as some of the great folks on the technical and sales teams here at Lightcrest.
Thank you for visiting Wavelengths, the Lightcrest company blog. In the coming days we hope to bring you additional information about our company, our employees, and the work we are doing for our clients in the public and private sectors.
Thanks for visiting – be sure to take a look around and check out the great services we offer.