October 11, 2004
SuSE, MySQL bring navigation revolution to cancer support site
The users of CancerSupportiveCare.com kept getting lost on the site, despite the site's adherence to Internet navigation conventions. So, the site's design team set about creating an unconventional, and yet old-fashioned, navigational system using SuSE Linux and a MySQL database. Webmaster Alexandra Andrews believes the system could cure the design chaos that plagues the World Wide Web.
CancerSupportiveCare.com's mission of providing cancer support information tailored to individuals struck a nerve. Once an outlet for cancer supportive care information was created, information and users began streaming in. When it launched in 1999, the site had 15 pages, hundreds of hits and visitors from seven countries. By September 2004, it had 35 portals, hundreds of pages, millions of hits and visitors from 134 countries.
Making this huge library of information accessible to people with disabilities and easy to navigate posed a big problem for the site's design team: Andrews, Neil Dunlop, Michael McMillan, Steve Stilson and Ernest H. Rosenbaum, M.D. In doing research on the issue, they found that information overload is a problem that's plaguing the Internet in general.
"There is a crisis in this presentation of information," said Andrews. As websites get larger and larger, they become more and more difficult to navigate. There's no standard for naming Web pages, for how Web pages are organized or for how information is sorted. Some sites have site maps, but a map covering more than 15 pages becomes unwieldy. "Pages cluttered with graphics and links or maps with 'you are here' are other navigational attempts," she said.
The team found that site search engines don't always give users the information they want. For example, a search for specific cancer information, such as Post Breast Therapy Pain Syndrome, brings up more than 100 pages because the word pain is found in so many pages.
The Cancer Supportive Care site developers decided to adopt a new way of organizing Web pages to address the above problems. "The Internet is the largest library in the world," said Andrews. "Why not adopt Library of Congress (LOC) Classification standards for organizing websites?"
The project has just been completed. Now, the site's Web pages are numbered and organized according to LOC categories. For example, the first Cutter number on a page is built from the last-name initial of the article's first author and an author number. A second Cutter number distinguishes Web pages that share the same author and category.
To set up this system, the site team wrote a searchable database -- using SuSE Linux, MySQL and PHP -- that brings up the page, the title, authors, the date the page appeared and the date the page was updated. The site map is a flat HTML page organized using LOC classification with descriptions similar to the old library card catalog model.
The site has resided on SuSE Linux for years, and it's proven to be easy to use, reliable, well-supported and robust, Andrews said. Along with all that, the distribution's accessibility features sold the team on using SuSE 9.1 for this project.
"Accessibility is a keystone of our website," said Andrews. With SuSE, the Braille display enables the blind and visually impaired to use a computer that is equipped accordingly. "Since the release of SuSE Linux 7.0, visually impaired users can not only work with Linux, they can even install their own systems," she said.
Cost, reliability and security drove the team's decision to use open source software in almost all instances.
"No viruses and blue screens of death," said Andrews. "If the original source code isn't secret, that takes away one big incentive to write worms. You are not a Windows galleon laden with treasure sailing the Internet seas waiting for privateers and pirates to board."
As for cost, buying an off-the-shelf, proprietary database and Web-creation software would have been very expensive. "This is one of the hottest fields in programming at the moment, so full-blown software that does these things is the most expensive on the market," said Andrews. "Writing our own eliminated these large costs."
On the flip side, using open source required more labor and time than using proprietary software. "I don't want to imply that using open source was completely easy," said Andrews. For example, the team had to write all the PHP using the vi text editor and other simple text editors and then translate the data manually from the MySQL tables. The good news is that doing this gave the team a better understanding of the code. "So, we were able to fix glitches ourselves," said Andrews.
One such glitch concerned filling the author field. Most of the articles on the site have multiple authors. The team used PHP as the go-between for the HTML pages and the MySQL database of keywords. PHP took the search terms from the fields of the search form and then issued MySQL queries to retrieve entries matching those fields.
"This was especially difficult when searching for authors, because each article had multiple authors, which is stored in a third table," said Andrews. "In other words, there is one table for articles, one for authors and one for order of authors in the article. The search term had to be translated between three different tables." Once that was set up, the PHP could create the HTML which would show each result, result count, error message, or status of each search.
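The three-table arrangement Andrews describes can be sketched roughly as follows. This is a minimal illustration, not the site's code: it uses Python's built-in sqlite3 module as a stand-in for the PHP and MySQL the team actually used, and every table name, column name and demo row is an invented assumption.

```python
import sqlite3

# Hypothetical schema: one table for articles, one for authors, and a
# third that records which authors wrote which article, and in what order.
SCHEMA = """
CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, modified TEXT);
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE article_authors (
    article_id INTEGER REFERENCES articles(id),
    author_id INTEGER REFERENCES authors(id),
    position INTEGER  -- order of this author in the article's byline
);
"""


def build_author_demo_db():
    """Create an in-memory database holding a few invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO articles VALUES (?, ?, ?)", [
        (1, "Managing Fatigue", "2004-08-01"),
        (2, "Nutrition During Therapy", "2004-09-15"),
    ])
    conn.executemany("INSERT INTO authors VALUES (?, ?)", [
        (1, "A. Example"),
        (2, "B. Sample"),
    ])
    conn.executemany("INSERT INTO article_authors VALUES (?, ?, ?)", [
        (1, 1, 1), (1, 2, 2), (2, 2, 1),
    ])
    return conn


def search_by_author(conn, term):
    """Return (title, byline) pairs for articles with a matching author."""
    # Walk from the author name through the link table to the articles.
    matches = conn.execute(
        """
        SELECT DISTINCT a.id, a.title, a.modified
        FROM articles AS a
        JOIN article_authors AS aa ON aa.article_id = a.id
        JOIN authors AS au ON au.id = aa.author_id
        WHERE au.name LIKE ?
        ORDER BY a.modified DESC
        """,
        (f"%{term}%",),
    ).fetchall()

    results = []
    for article_id, title, _modified in matches:
        # Rebuild the full byline in its original author order.
        byline = [row[0] for row in conn.execute(
            """
            SELECT au.name
            FROM article_authors AS aa
            JOIN authors AS au ON au.id = aa.author_id
            WHERE aa.article_id = ?
            ORDER BY aa.position
            """,
            (article_id,),
        )]
        results.append((title, byline))
    return results
```

Calling `search_by_author(build_author_demo_db(), "Sample")` returns both demo articles, each with its complete byline, which is the kind of lookup that has to cross all three tables.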
Now, the search page can search for articles in the database by author, title, keyword(s), description and/or Library of Congress number. It returns articles sorted with the most recently modified first. If a user enters search terms for more than one field, it searches for one, then the other, appending the results to the screen. Users can click on each result to go to the full article's text.
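The field-by-field behavior described above might look something like the sketch below. It again uses Python and sqlite3 as a stand-in for the site's PHP and MySQL, and the column names, classification numbers and demo rows are invented placeholders. The author field is left out here because, as Andrews notes, it needs a lookup across several tables.

```python
import sqlite3

# Hypothetical mapping of search-form fields to column names. Only names
# from this fixed mapping are ever placed into the SQL text.
SEARCHABLE = {
    "title": "title",
    "keywords": "keywords",
    "description": "description",
    "loc_number": "loc_number",
}


def build_article_demo_db():
    """Create an in-memory database holding a few invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, "
        "keywords TEXT, description TEXT, loc_number TEXT, modified TEXT)"
    )
    conn.executemany("INSERT INTO articles VALUES (?, ?, ?, ?, ?, ?)", [
        (1, "Coping with Pain", "pain, coping",
         "Placeholder description one", "RC262.A1", "2004-07-01"),
        (2, "Exercise Basics", "exercise, fatigue",
         "Placeholder description two", "RC271.B2", "2004-09-01"),
        (3, "Pain Medication Guide", "pain, medication",
         "Placeholder description three", "RC262.C3", "2004-08-15"),
    ])
    return conn


def search_fields(conn, form):
    """Search each filled-in field in turn and append its results."""
    sections = []
    for field, column in SEARCHABLE.items():
        term = form.get(field, "").strip()
        if not term:
            continue  # the user left this field empty
        rows = conn.execute(
            f"SELECT title, modified FROM articles "
            f"WHERE {column} LIKE ? ORDER BY modified DESC",
            (f"%{term}%",),
        ).fetchall()
        sections.append((field, rows))
    return sections
```

With the demo data, `search_fields(conn, {"keywords": "pain", "loc_number": "RC271"})` yields a keywords section listing the two pain articles, newest first, followed by a separate section for the classification-number match.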
The idea behind the site's database is to be able to search articles by keyword(s) rather than full text. This prevents a lot of extraneous results that don't really deal with that search term as their primary topic. With this database, the team addressed users' most common concerns about websites: Is the information on this Web page credible? Who wrote it? When was it written? Has the information been revised? The last question is critical, Andrews said, because current information is imperative in patients' medical research.
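The difference between keyword search and full-text search can be shown with a toy example. This is only a sketch in Python over an in-memory list; the records, their keyword lists and their body text are invented for illustration and do not come from the site.

```python
# Invented demo records. Each article carries a short list of assigned
# keywords in addition to its body text.
ARTICLES = [
    {
        "title": "Post Breast Therapy Pain Syndrome",
        "keywords": ["post breast therapy pain syndrome", "pain"],
        "body": "Placeholder text about pain as the main topic.",
    },
    {
        "title": "Nutrition During Therapy",
        "keywords": ["nutrition", "diet"],
        "body": "Placeholder text that mentions pain only in passing.",
    },
    {
        "title": "Exercise Basics",
        "keywords": ["exercise", "fatigue"],
        "body": "Placeholder text with another passing mention of pain.",
    },
]


def full_text_search(term):
    """Match any article whose title or body contains the term."""
    term = term.lower()
    return [
        a["title"] for a in ARTICLES
        if term in a["title"].lower() or term in a["body"].lower()
    ]


def keyword_search(term):
    """Match only articles that list the term as an assigned keyword."""
    term = term.lower()
    return [a["title"] for a in ARTICLES if term in a["keywords"]]
```

Here `full_text_search("pain")` returns all three titles, while `keyword_search("pain")` returns only the article whose primary topic is pain.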
By addressing users' needs with this project, the CancerSupportiveCare.com designers have begun their new mission: providing a model for new Web-based medical projects. "We are writing a piece on how to use computers with cancer-related therapy issues," said Andrews. "We are pioneering doing medical research using the Web." Currently, the team is working with database vendors to find a new way of capturing and presenting data from medical information questionnaires.
In the meantime, the team is enjoying the rewards of a job well done. In the few weeks since the new system has been up, many users have reported that their "lost-in-navigation" problems are cured. "The majority of our site visitors know how to use a library," said Andrews. "By adhering to LOC standards, CancerSupportiveCare.com becomes organized on a familiar system."
Reprinted by permission
More about this revolutionary idea
06 Oct 2004 | SearchEnterpriseLinux.com
Crisis on the World Wide Web: A Library Website Model