Developing a Niche Online-Content Indexing System?

July 17th, 2010 07:04 admin

tebee writes “One of my hobbies has benefited for 20
years or so from the existence of an online index to all magazine
articles on the subject since the 1930s. It lets you list the
articles in any particular magazine or search for an article by
keyword, title or author, refining the search if necessary by
magazine and/or date. Unfortunately the firm which hosts the
index has recently pulled it from its website, citing security
worries and incompatibilities with the rest of its e-commerce
website: the heart of the system is a 20-year-old DOS program! They
have no plans to replace it as the original data is in an unknown
format. So we are talking about putting together
a team to build an open-source replacement for this – probably using
PHP and MySQL. The governing body for the hobby has agreed to host
this and we are in negotiations to try and get the original data. We
hope that by having volunteers crowdsource the conversion we will be
able to do what was commercially impossible.”
Tebee is looking for ideas about the best way to go about this, and for leads to existing approaches; read on for more.

tebee continues:

“It occurs to me that there could be
existing open-source projects that do roughly what we want to do —
maybe something indexing academic papers. But two days of trawling
through script sites and googling has not produced any results.

Remember that here we only point to the
original article; we don’t have the text of it online, though it has
been suggested that we expand to do this. Unfortunately I think
copyright considerations will prevent us from doing it, unless we can
get our own version of the Google book agreement!

So does anyone know of anything that
will save us the effort of writing our system or at least provide a
starting point for us to work on?”
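
As a rough illustration of the features described above (listing the articles in one magazine, searching by keyword, title or author, and refining by magazine and/or date), here is a minimal sketch in Python using the standard-library sqlite3 module. It is not the original DOS program's design, and it is not the PHP and MySQL code the team has in mind; the table names, column names and function names are all assumptions made up for this example, and SQLite stands in for MySQL only so that the sketch runs without a server.

```python
import sqlite3


def build_index(conn):
    """Create a hypothetical two-table schema: magazines and the articles in them."""
    conn.executescript("""
        CREATE TABLE magazine (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE article (
            id          INTEGER PRIMARY KEY,
            magazine_id INTEGER NOT NULL REFERENCES magazine(id),
            title       TEXT NOT NULL,
            author      TEXT NOT NULL,
            keywords    TEXT NOT NULL DEFAULT '',
            issue_date  TEXT NOT NULL,   -- ISO 8601 (YYYY-MM-DD) so text order is date order
            page        INTEGER
        );
        CREATE INDEX article_by_magazine ON article(magazine_id, issue_date);
    """)


def add_article(conn, magazine, title, author, keywords, issue_date, page=None):
    """Insert one index entry, creating the magazine row on first use."""
    conn.execute("INSERT OR IGNORE INTO magazine(name) VALUES (?)", (magazine,))
    (magazine_id,) = conn.execute(
        "SELECT id FROM magazine WHERE name = ?", (magazine,)
    ).fetchone()
    conn.execute(
        "INSERT INTO article(magazine_id, title, author, keywords, issue_date, page) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (magazine_id, title, author, keywords, issue_date, page),
    )


def search(conn, text=None, magazine=None, date_from=None, date_to=None):
    """Return (magazine, issue_date, title, author, page) rows matching every filter given.

    `text` is matched as a substring of title, author or keywords. Wildcard
    characters typed by the user (% and _) are not escaped in this sketch.
    """
    clauses, params = [], []
    if text:
        like = f"%{text}%"
        clauses.append("(a.title LIKE ? OR a.author LIKE ? OR a.keywords LIKE ?)")
        params += [like, like, like]
    if magazine:
        clauses.append("m.name = ?")
        params.append(magazine)
    if date_from:
        clauses.append("a.issue_date >= ?")
        params.append(date_from)
    if date_to:
        clauses.append("a.issue_date <= ?")
        params.append(date_to)
    where = ("WHERE " + " AND ".join(clauses)) if clauses else ""
    sql = (
        "SELECT m.name, a.issue_date, a.title, a.author, a.page "
        "FROM article a JOIN magazine m ON m.id = a.magazine_id "
        f"{where} ORDER BY a.issue_date, a.title"
    )
    return conn.execute(sql, params).fetchall()
```

Calling search(conn, magazine="Some Magazine") with no other arguments gives the per-magazine listing, and adding text, date_from or date_to narrows it. Only pointers to the articles are stored (magazine, date, page), never the article text, which matches the constraint tebee describes. A real deployment would probably want proper full-text indexing rather than LIKE, but that choice is left open here.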

Source: Developing a Niche Online-Content Indexing System?
