Archive for category Website Development

FIRSTSearch Released

After various delays of the server-is-down sort, I’ve finally finished fixing the indexing software, and FIRSTSearch is ready for (beta) release. It’s by no means an incredible piece of software – it has some bugs, it lacks some features, the display of results is less than perfect, and I have my doubts that it’s really finding the most relevant results. But it’s working, and it’s ready for other people to start using.

I expect to have a non-beta release within a few weeks.

No Comments

Migrating MySQL to PostgreSQL By Hand

I recently completed a manual migration of my data for CTDA (an analysis system for team data collected from thebluealliance.net and other sources) to PostgreSQL from MySQL (see my comparison for why). By ‘manual’, I do not mean that I typed the data in by hand – I mean that for each table, I wrote a separate script to transfer the data, instead of writing (or finding online) a single script to copy a whole database. My advice to you, if you are planning to migrate data to PostgreSQL from MySQL, is this: don’t do it by hand.

Suppose I had been unable to find a workable script online to transfer the database for me. This is unlikely at best, but even then, I could probably have written one that covered all the cases needed for my database in about 3 or 4 hours (I’m rather new to postgres). I would have sat down one day, written the script, tested it a bit, set it to run overnight, and been done.

But that’s not what I did. I decided that it would be ‘easier’ to write a script for each table. Perhaps it saved me some thinking – but the process ended up taking me two weeks, and much more than 4 hours. I would estimate that, because of this delay, CTDA will be released 3 weeks later than it would have been if I had written a good script to do this for me.
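For illustration, here is a rough sketch of what a single generic copy script could look like. It is written in Python rather than the language I actually used, and it makes several assumptions: both connections follow the Python DB-API, the destination tables already exist with the same column names, and no type conversion between MySQL and PostgreSQL is needed (in practice, it often is). The function names, the `teams` table, and the in-memory SQLite stand-ins in `demo` are all hypothetical.

```python
import sqlite3


def copy_table(src_conn, dst_conn, table, placeholder="%s", batch_size=1000):
    """Copy every row of `table` from one DB-API connection to another.

    Assumes the table already exists on the destination with the same
    column names, and that `table` is a trusted identifier (it is pasted
    into the SQL text, not passed as a parameter).
    """
    src = src_conn.cursor()
    dst = dst_conn.cursor()
    src.execute("SELECT * FROM {}".format(table))
    columns = [col[0] for col in src.description]
    insert_sql = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(columns), ", ".join([placeholder] * len(columns))
    )
    copied = 0
    while True:
        # Work in batches so a large table never has to fit in memory.
        rows = src.fetchmany(batch_size)
        if not rows:
            break
        dst.executemany(insert_sql, rows)
        copied += len(rows)
    dst_conn.commit()
    return copied


def copy_database(src_conn, dst_conn, tables, placeholder="%s"):
    """Copy each named table in turn and return the row count per table."""
    return {t: copy_table(src_conn, dst_conn, t, placeholder) for t in tables}


def demo():
    """Run the copy between two in-memory SQLite databases (stand-ins only)."""
    src = sqlite3.connect(":memory:")
    dst = sqlite3.connect(":memory:")
    for conn in (src, dst):
        conn.execute("CREATE TABLE teams (number INTEGER, name TEXT)")
    src.executemany("INSERT INTO teams VALUES (?, ?)", [(1, "a"), (2, "b")])
    src.commit()
    # SQLite uses "?" placeholders; most MySQL/PostgreSQL drivers use "%s".
    counts = copy_database(src, dst, ["teams"], placeholder="?")
    rows = dst.execute("SELECT number, name FROM teams ORDER BY number").fetchall()
    return counts, rows
```

The point of the sketch is only that one loop over a list of table names replaces a pile of per-table scripts; the hard part of a real migration is the data types and constraints that this ignores.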

Oh, and in case you can’t guess my advice to those who are planning to migrate from PostgreSQL to MySQL: don’t.

No Comments

MySQL vs PostgreSQL: Benchmarking Data

After looking into migrating to PostgreSQL, which seems to be a popular pastime among database people (migrating, not looking into), I decided to do my own benchmarks. Here are the results of the simple ones (I have yet to code the complex ones). I wrote all the code in Perl, and ran it on quentin. I used InnoDB with its default settings for MySQL, and left everything at its defaults for PostgreSQL.

Read the rest of this entry »

2 Comments

New Additions to Website

We’ve been trying to get quality content up on the website, emphasizing resources for robotics and programming. I put together version 1 of an article system, which can split a long page into multiple sections based on header tags, and automatically generate a table of contents. We’ve written a few new articles, but we’ve mostly just been trying to get old content online. The current article list can be viewed here. Most recently, I put up Eric’s Subversion Crash Course.
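As a rough idea of how header-based splitting can work, here is a small Python sketch. It is not the code running on the site: it assumes sections are delimited by `<h2>` tags, uses a regular expression rather than a real HTML parser, and the names `split_sections`, `slugify`, and `table_of_contents` are made up for this example.

```python
import re

# Assumption: each section of an article starts with an <h2> header.
HEADER_RE = re.compile(r"<h2\b[^>]*>(.*?)</h2>", re.IGNORECASE | re.DOTALL)


def split_sections(html):
    """Split `html` into (title, body) pairs, one per <h2> header.

    Any text before the first header is returned under the title None.
    """
    sections = []
    title, start = None, 0
    for match in HEADER_RE.finditer(html):
        body = html[start:match.start()].strip()
        if title is not None or body:
            sections.append((title, body))
        # Drop any inline markup inside the header to get a plain title.
        title = re.sub(r"<[^>]+>", "", match.group(1)).strip()
        start = match.end()
    sections.append((title, html[start:].strip()))
    return sections


def slugify(title):
    """Turn a section title into a lowercase anchor name."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def table_of_contents(sections):
    """Return (title, anchor) pairs for every titled section."""
    return [(title, "#" + slugify(title)) for title, _ in sections if title]
```

Given a page such as `<p>Intro</p><h2>Setup</h2><p>A</p><h2>Usage</h2><p>B</p>`, `split_sections` yields an untitled introduction plus one section per header, and `table_of_contents` lists only the titled ones.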

We’ve also been working on a new index page, which would look somewhat better, have more content, and be more inviting overall.

Finally, we should have breadcrumb navigation soon.

No Comments

Writing a Search Engine – Part 3

Note: this article is a continuation of previous articles on search engines (part 1, part 2).

After testing out an alpha version of my search engine for a while, I realized that its greatest flaw (other than printing out the results in a downright ugly format) was that it couldn’t recognize “programs” as a variant of the word “program”. I briefly considered programming it to automatically check for a limited set of common variants, but I decided that this was probably too much effort for what would be a decidedly low-quality result. What I needed was a list of all the variants of every word.

What I was looking for (I discovered after about 45 minutes of IRC chat, Google, and man pages) was the ispell English dictionary. Ispell dictionaries have a list of ‘roots’, and for every root, a list of flags that describe how that root can be transformed to form valid words. I could load this information into a MySQL table with a Perl script, and then retrieve it quickly both while indexing and while searching.
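To make the root-and-flag idea concrete, here is a small Python sketch (not the Perl script described above). The three suffix rules are a simplified approximation of the S, D, and G flags in the ispell English affix file; the real file defines more flags, including prefixes, and should be treated as the authority. The `root/FLAGS` entry format, the function names, and the in-memory dictionary standing in for the database table are assumptions made for this example.

```python
import re

# Simplified suffix rules, loosely modeled on three ispell English flags.
# Each rule: (pattern the root must end with, letters to strip, letters to add).
SUFFIX_RULES = {
    "S": [
        (r"[^aeiou]y$", "y", "ies"),
        (r"[aeiou]y$", "", "s"),
        (r"[sxzh]$", "", "es"),
        (r"[^sxzhy]$", "", "s"),
    ],
    "D": [
        (r"e$", "", "d"),
        (r"[^aeiou]y$", "y", "ied"),
        (r"[^ey]$", "", "ed"),
        (r"[aeiou]y$", "", "ed"),
    ],
    "G": [
        (r"e$", "e", "ing"),
        (r"[^e]$", "", "ing"),
    ],
}


def expand(entry):
    """Expand an entry such as 'search/DGS' into a sorted list of word forms."""
    root, _, flags = entry.strip().lower().partition("/")
    words = {root}
    for flag in flags.upper():
        # Apply the first rule of this flag whose pattern matches the root.
        for pattern, strip, add in SUFFIX_RULES.get(flag, []):
            if re.search(pattern, root):
                stem = root[: len(root) - len(strip)] if strip else root
                words.add(stem + add)
                break
    return sorted(words)


def build_variant_table(entries):
    """Map every generated word form back to its root (a stand-in for the DB table)."""
    table = {}
    for entry in entries:
        root = entry.strip().lower().partition("/")[0]
        for form in expand(entry):
            table.setdefault(form, root)
    return table
```

With a table like this, both the indexer and the search front end can reduce a word to its root before comparing, so a query for “programs” can match pages that only contain “program”.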

Read the rest of this entry »

No Comments

Creating Basic Breadcrumb Navigation with PHP

“Breadcrumb navigation” is the feature on many websites (including ours, now) where a line of links shows the position of the current page in the overall hierarchy. This often corresponds directly to the URL. For example, if the URL were http://robot.mbhs.edu/resources/web/html, the breadcrumb bar might display something like Team 449 >> Resources >> Website Development >> HTML (except with links). This is useful to users, who get another navigational tool, and search engines also like it, because it results in a lot of links to your site within your site. As usual, I’ve put the source code used in the Blair Robot Project website below.
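Before the PHP, here is the basic idea as a short Python sketch. It is only an illustration, not the site's code: the `TITLES` mapping from URL segments to display names is hypothetical, and leaving the current page unlinked is a design choice of this sketch.

```python
from html import escape
from urllib.parse import urlsplit

# Hypothetical mapping from path segments to display names; unknown
# segments fall back to a capitalized version of the segment itself.
TITLES = {"resources": "Resources", "web": "Website Development", "html": "HTML"}


def breadcrumbs(url, home_title="Team 449", titles=TITLES):
    """Return (label, href) pairs from the site root down to the page at `url`."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    crumbs = [(home_title, "/")]
    href = ""
    for segment in segments:
        # Each crumb links to the path built up so far.
        href += "/" + segment
        crumbs.append((titles.get(segment, segment.capitalize()), href))
    return crumbs


def render(crumbs, separator=" >> "):
    """Render crumbs as HTML links, leaving the current (last) page unlinked."""
    parts = [
        '<a href="{}">{}</a>'.format(escape(href, quote=True), escape(label))
        for label, href in crumbs[:-1]
    ]
    parts.append(escape(crumbs[-1][0]))
    return escape(separator).join(parts)
```

Calling `breadcrumbs` on the example URL above gives one crumb per path segment plus the home page, and `render` joins them with an HTML-escaped separator.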

Read the rest of this entry »

2 Comments

New UI Uploaded on Website

After several months of waiting, previewing, fixing, and starting over, the new look of the Team 449 website is almost finished. The beta version has been uploaded, although work continues on it. The main changes that will be made soon are:

  • typesetting
  • the left navbar
  • a new status bar

Improvements are still being made and bugs are still being ironed out, but for now, the site validates as XHTML 1.0 Transitional.

It is our intent to eventually create an HTML 5 version that can be viewed by compatible browsers. (CSS3 will be included no matter what browser is being used, since it causes no adverse effects.)

1 Comment

Writing a Search Engine – Part 2

Note: this article is a continuation of a previous article on search engines, and has been continued with part 3.

After a bit over a week of coding (and learning various Perl libraries), I have completed stages 1 and 2 of the search, although stage 2, the indexer, could do with a little improvement. Both are written in Perl, and as usual, the complete code listings are below. I decided to write the entire spider and indexer in Perl and optimize as necessary later on, so that I could get the thing done and not get bogged down in C code. If the Perl turns out not to be fast enough as the site grows, then I plan to port to C. Likewise, the actual search part (stage 3) will be written in PHP to save time. If the PHP is not fast enough, I’ll rewrite it in C – but I expect there to be no problems.

Read the rest of this entry »

No Comments

Writing a Search Engine – Part 1

I decided that the website had grown to the point where it needed a search engine. I didn’t want to use a Google search or an embedded Yahoo search – they look disgusting. I also didn’t want to use any of the third-party searching scripts, since most of them were costly and all of them had commercial licenses. I like free software. So I set out to write my own.

General Considerations

Let me start off by saying that what I have below is not a magic, easy solution to writing a search engine. If you are planning to write the world’s “next Google”, I have a recommendation: go to http://bing.com – Microsoft’s “next Google”. Notice how “copycatted” it looks. Then search around (on Google, please) to find out exactly how popular it is. Hint: not very. Microsoft tried and failed. Don’t waste your time. My problem is to build an internal search engine, which only needs to deal with a small number of pages, and is low-traffic, so it doesn’t have to be super fast. When I told my co-working friends about the project, the responses I got varied from “maniac” to “shoot yourself now rather than afterwards – save some time”. And that’s with a highly simplified version of the problem.

Read the rest of this entry »

1 Comment

Maintaining a Separate Draft Copy of a Website

On websites with traffic outside of a group of coworkers, you may find it desirable to modify a copy of the website, and then periodically upload the new version. Both the main robotics website and TMS (“Team Management System”) are developed apart from the main website, and then the drafts are periodically “pushed” onto the main site.

While this technique may not seem particularly impressive, some problems come up when you actually try to implement it. The biggest problem is that when you push, all the links break. A link to /draft/page.html needs to become /page.html when the page is pushed, and this is hard to automate. (A simple regexp is not enough: what about favicons and stylesheets?) A smaller problem is design-based, and depends on how you plan to store your pages. If you use a simple filesystem-oriented storage method, there will be no problem.
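To show the shape of the link problem, here is one possible approach sketched in Python: parse the page and rewrite only the attributes that hold URLs, so that stylesheet and favicon links are handled along with ordinary anchors. This is an illustration under assumptions, not necessarily the method used on our site. The `/draft` prefix comes from the example above, the list of URL attributes is my own guess at what matters, and URLs buried inside CSS or JavaScript are not touched.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: these are the attributes that can carry site-internal URLs.
URL_ATTRS = {"href", "src", "action"}


def strip_prefix(url, prefix="/draft"):
    """Turn /draft/page.html into /page.html; leave other URLs alone."""
    if url == prefix or url.startswith(prefix + "/"):
        return url[len(prefix):] or "/"
    return url


class LinkRewriter(HTMLParser):
    """Re-emit a page, stripping the draft prefix from URL attributes."""

    def __init__(self, prefix="/draft"):
        super().__init__(convert_charrefs=False)
        self.prefix = prefix
        self.out = []

    def _tag(self, tag, attrs, closing):
        parts = [tag]
        for name, value in attrs:
            if value is None:  # bare attribute such as "disabled"
                parts.append(name)
                continue
            if name in URL_ATTRS:
                value = strip_prefix(value, self.prefix)
            parts.append('{}="{}"'.format(name, escape(value, quote=True)))
        self.out.append("<" + " ".join(parts) + closing)

    def handle_starttag(self, tag, attrs):
        self._tag(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._tag(tag, attrs, " />")

    def handle_endtag(self, tag):
        self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&{};".format(name))

    def handle_charref(self, name):
        self.out.append("&#{};".format(name))

    def handle_comment(self, data):
        self.out.append("<!--{}-->".format(data))

    def handle_decl(self, decl):
        self.out.append("<!{}>".format(decl))


def rewrite_links(html, prefix="/draft"):
    """Return `html` with the draft prefix removed from every URL attribute."""
    rewriter = LinkRewriter(prefix)
    rewriter.feed(html)
    rewriter.close()
    return "".join(rewriter.out)
```

Note that the output is re-serialized, so attribute quoting and tag case may differ from the input even where no link changed.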

Read the rest of this entry »

1 Comment