SearchYC RSS Feeds

by chengmi on 2008-05-27

For the past few weeks, we've been working on a new algorithm for our Hacker News web crawler. The objective has been to decrease the time our crawler takes to index new threads while keeping the whole database reasonably up-to-date with regard to points, edits, and deletions. While we've made significant progress on this problem, we're not quite at the point where we can reliably offer the feature that we really want to build: notifications.

So in the meantime, we've built the next best thing. Today, we're launching RSS feeds for SearchYC search results. While you won't get a friendly reminder when someone replies to your comments and submissions, you now have a way to track keywords across Hacker News. We've found the feeds to be extremely useful for following discussions of SearchYC on Hacker News. I simply do a search for "SearchYC", sort by date, and then save the RSS feed using the blue icon at the top of the page. Now I know when people are talking about our project and what they are saying.
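For the curious, here is a minimal sketch of how a list of search results could be turned into an RSS 2.0 feed. This is not our production code: the function name `results_to_rss`, the shape of each result (a dict with `title`, `link`, and `date`), and the channel description are all assumptions made up for illustration. Only the `?sort=by_date` URL format comes from the site itself.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.parse import quote
import xml.etree.ElementTree as ET


def results_to_rss(query, results):
    """Render search results as an RSS 2.0 document, newest item first.

    `results` is a hypothetical list of dicts with "title", "link",
    and "date" (a timezone-aware datetime) keys.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")

    # RSS 2.0 requires title, link, and description on the channel.
    ET.SubElement(channel, "title").text = f"SearchYC: {query}"
    ET.SubElement(channel, "link").text = (
        f"http://searchyc.com/{quote(query)}?sort=by_date"
    )
    ET.SubElement(channel, "description").text = (
        f"Hacker News items matching {query!r}"
    )

    # Sort by date, newest first, so readers see fresh mentions at the top.
    for result in sorted(results, key=lambda r: r["date"], reverse=True):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = result["title"]
        ET.SubElement(item, "link").text = result["link"]
        # pubDate must be an RFC 822 style date string.
        ET.SubElement(item, "pubDate").text = format_datetime(result["date"])

    return ET.tostring(rss, encoding="unicode")
```

Sorting inside the function mirrors the "sort by date" step described above, so a feed reader always sees the most recent mentions first.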

Another feature that we launched recently was a Firefox plugin for integrating SearchYC into your browser. Unfortunately, our site went down shortly after the release, so not many people were able to download it. We were reminded of this debacle when Richard Atkinson wrote a SearchYC plugin of his own, which is available on his blog.

We also loved Gabriel Weinberg's Ask YC Archive wiki. His page does a great job of organizing interesting topics and it's such a great resource that we've linked to it from our own Ask YC Archive.

SearchYC 2.0!

by chengmi on 2008-04-23

While SearchYC remains one of the best third-party search engines for Hacker News, we've been plagued with various performance problems. Many of you may have noticed that searches are extremely slow--sometimes even too slow for the site to be useful.

Well, that's what happens when you try to design and build a search engine in a week. Back in December, we decided to use the acts_as_ferret Rails plugin for its dead simple setup and configuration. This turned out to be a big mistake, as we soon found that Ferret (a Ruby port of Lucene) is surprisingly unstable in a production environment using DRb.

We stuck with Ferret for a while because we weren't sure how many people would be using our website (the project started as a simple database for finding interesting things to read). Now that we're getting pretty regular traffic (which still amazes me, since we've spread largely by word-of-mouth), we figured it was time for a change.

So we've essentially rebuilt SearchYC from the ground up. What began as a change in search technology turned into a monster refactoring and update of the entire codebase. Here are a few of the changes we made:

  1. We ditched Ferret, and adopted Solr instead. Solr has the advantage of being based on Lucene, at the cost of running a Java virtual machine in the background. There are many other websites that use Solr for search.
  2. Upgraded from Rails 1.2 to Rails 2.0--not a huge change, but there is a slight performance boost in the newer version.
  3. Moved some features to subdomains. For example, http://searchyc.com/top/list is now http://top.searchyc.com. The legacy URLs are still supported through redirection.
  4. Changed the format of URLs. http://searchyc.com/by_date/query is now http://searchyc.com/query?sort=by_date. The old format seems much cleaner, but the new format has allowed us to do some pretty nifty things...
  5. Nifty things such as new search options! You can now search for things like titles, domains, comments, user submissions, user comments, etc.
  6. And by far my favorite new feature: search within results.
    Want to find all the times pg mentions Arc in a comment? No problem. Do a user search for pg's comments, and then click the "Within Results" link to search for "Arc".
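To make the URL changes in items 3 and 4 concrete, here is a rough sketch of how a legacy URL could be mapped to its new form before issuing a redirect. It is an illustration only, not how our Rails routes are actually written: the function name `translate_legacy_url` and the `LEGACY_SORTS` set are hypothetical, and only the two URL patterns mentioned above are handled.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

# Hypothetical set of legacy sort prefixes; only "by_date" appears in this post.
LEGACY_SORTS = {"by_date"}


def translate_legacy_url(url):
    """Return the new-style URL for a legacy one, or the input unchanged."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]

    # /top/list moved to the "top" subdomain.
    if segments == ["top", "list"]:
        return urlunsplit((parts.scheme, "top." + parts.netloc, "", "", ""))

    # /by_date/<query> became /<query>?sort=by_date.
    if len(segments) == 2 and segments[0] in LEGACY_SORTS:
        sort, query = segments
        return urlunsplit(
            (parts.scheme, parts.netloc, "/" + query, urlencode({"sort": sort}), "")
        )

    # Anything else is already in the new format (or unknown); leave it alone.
    return url
```

Moving the sort order into the query string is what makes the extra search options possible: further parameters can be appended without inventing a new path prefix for each combination.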

So there you have it. We've also added a link at the top of the page for feedback (powered by Disqus), so please let us know what you think of these changes.

We'll be announcing a few more new features soon, so stay tuned...

SearchYC Presents: Ask YC Archive

by alaskamiller on 2008-02-20

One of the finest assets of an online community has always been the hodgepodge of expertise brought forth by a diverse group of people. As Hacker News grew beyond just a link bank, the community naturally turned inwards for consultation. Submissions “tagged” with Ask YC started popping up more and more, and the responses they solicited were both interesting and honest.

So to help people get the most out of past discussions, we’ve compiled a list of Hacker News community threads. The Ask YC Archive features a bar graph for visualizing discussions over time based on points. We’ve also set up an RSS feed so you can keep track of new Ask YC threads in your favorite RSS reader. We hope these tools will allow our community to continually benefit from the wisdom of past discussions.

SearchYC Presents: Best of Hacker News

by alaskamiller on 2008-02-05

Today we’re launching another feature of SearchYC: Top Lists. Using our index of Hacker News threads and comments, we’ve compiled lists of the most interesting items to date.

So what did we find? Sticking true to the Web 2.0 mentality, TechCrunch articles were the most submitted, by far. Other Silicon Valley favorites such as Valleywag, Mashable, and Wired also made it to the top.

As for top users, nickb gets the honor with more than 1900 submissions while pg has more than 2000 comments. Go ahead and take a look for yourself, and see if you can find other trends in Hacker News activity.

The YC Bump

by alaskamiller on 2008-01-10

So how did we do on launch day? 554 uniques in the first 24 hours and 231 that flowed in after the party, so maybe about 785 people in all. Definitely not a slashdotting or a digging but quite a turnout nonetheless. As the days go on, hopefully quite a few stick around and find some value in our little project.