TN0017: TADB - Some Notes on my Technical Abbreviations Database
This note explains the origin and the inner workings of my Technical
Abbreviations Database and makes a snapshot of it available as a ZIP file.
Introduction
As a techie, I'm always running across new abbreviations when reading magazines and articles. So,
in December 2000, I decided to start collecting them and make this collection available
online through a search page on my web site. Three years later, I wanted to make
the whole collection available as a downloadable file so that other (open source)
projects could make use of it.
Technical Details
The abbreviations are stored on my server in a table in a MySQL database. The table
contains eight columns:
-
AID: Each abbreviation is assigned a unique number, the abbreviation ID. This
is needed for two reasons: each table in a relational database needs to have a
primary key, and my 10+ years' experience with databases tells me to always use a synthetic
primary key. The real world changes too much, and things that you think will stay the same
over time, like ZIP codes or customer IDs, actually do change. The first happened in the 1990s
in Germany and the second happens whenever companies merge.
-
UID: For each abbreviation, I store the user id of the user who entered it into
the database. If there is ever a question about a certain entry in the database, this
information will help resolve the issue and will also help in finding other entries made
by the same user.
-
ABKTEXT: Textual representation of the abbreviation. Can be upper case, lower case
and may contain spaces or punctuation. This representation is used for display to humans.
-
SUCHTEXT: This is the internal, normalized representation of the abbreviation. All
white space and punctuation have been removed and all letters have been changed to
upper case. This column is used for searching for abbreviations.
-
LANGTEXT: This is the full text that the abbreviation stands for.
-
SPRACHE: This is the language. I use "EN" for English and "DE" for German.
-
BEMERKUNG: This field may contain a short comment or URL.
-
DATUM: This field contains the date and time of entry.
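The column layout described above can be sketched roughly as follows. This is only an illustration, not the actual server code: it uses Python with the built-in sqlite3 module instead of MySQL, and the table name `tadb`, the column types, and the helpers `normalize` and `add_entry` are assumptions made for the example. Only the column names and the normalization rule (remove white space and punctuation, convert to upper case) come from the list above.

```python
import sqlite3
from datetime import datetime


def normalize(abbreviation: str) -> str:
    """Build a SUCHTEXT-style key: drop white space and punctuation, upper-case the rest."""
    return "".join(ch for ch in abbreviation if ch.isalnum()).upper()


# Assumed schema: only the column names are taken from the note.
SCHEMA = """
CREATE TABLE tadb (
    AID       INTEGER PRIMARY KEY AUTOINCREMENT,  -- synthetic primary key
    UID       INTEGER NOT NULL,                   -- user who entered the row
    ABKTEXT   TEXT NOT NULL,                      -- display form
    SUCHTEXT  TEXT NOT NULL,                      -- normalized search form
    LANGTEXT  TEXT NOT NULL,                      -- full text
    SPRACHE   TEXT NOT NULL,                      -- 'EN' or 'DE'
    BEMERKUNG TEXT,                               -- optional comment or URL
    DATUM     TEXT NOT NULL                       -- date and time of entry
)
"""


def add_entry(conn, uid, abktext, langtext, sprache, bemerkung=None):
    """Insert one abbreviation and return its newly assigned AID."""
    cur = conn.execute(
        "INSERT INTO tadb (UID, ABKTEXT, SUCHTEXT, LANGTEXT, SPRACHE, BEMERKUNG, DATUM) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (uid, abktext, normalize(abktext), langtext, sprache, bemerkung,
         datetime.now().isoformat(timespec="seconds")),
    )
    return cur.lastrowid


# Sample row for illustration only; user id 1 is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
aid = add_entry(conn, 1, "XML-RPC", "XML Remote Procedure Call", "EN")
```

Deriving SUCHTEXT from ABKTEXT at insert time, as in this sketch, keeps the two columns consistent without the person entering the abbreviation having to type both forms.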
The front end to this database consists of two PHP scripts, one for entry of new
abbreviations, which is only available for logged-in users, and the other
for searching the TADB, which is available to the general public. You can
access it by going to
http://my.kuckuk.com/tadb.php
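A search against the normalized column might look something like the sketch below. It is not the PHP script itself: it uses Python with sqlite3, and the table name `tadb`, the exact-match comparison, the reduced set of columns, and the sample rows are all assumptions for illustration. The idea it shows is that the search term is normalized the same way as SUCHTEXT, so differences in case, spacing, and punctuation do not matter.

```python
import sqlite3


def normalize(text: str) -> str:
    """Drop white space and punctuation, then upper-case (same rule as SUCHTEXT)."""
    return "".join(ch for ch in text if ch.isalnum()).upper()


def search(conn, query):
    """Return (ABKTEXT, LANGTEXT, SPRACHE) rows whose SUCHTEXT equals the normalized query."""
    return conn.execute(
        "SELECT ABKTEXT, LANGTEXT, SPRACHE FROM tadb "
        "WHERE SUCHTEXT = ? ORDER BY LANGTEXT",
        (normalize(query),),
    ).fetchall()


# Small in-memory table with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tadb (ABKTEXT TEXT, SUCHTEXT TEXT, LANGTEXT TEXT, SPRACHE TEXT)"
)
conn.executemany(
    "INSERT INTO tadb VALUES (?, ?, ?, ?)",
    [
        ("XML-RPC", "XMLRPC", "XML Remote Procedure Call", "EN"),
        ("SOAP", "SOAP", "Simple Object Access Protocol", "EN"),
    ],
)

# "xml rpc" and "XML-RPC" both normalize to "XMLRPC" and find the same row.
hits = search(conn, "xml rpc")
```

Passing the search term as a bound parameter rather than pasting it into the SQL string also keeps a public search form safe from SQL injection.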
Downloadable Version
In December 2003, it became clear this collection might be useful to others in the
Open Source / Free Software world, so I created a snapshot of the database with
850 entries in SQL, CSV, and TXT format. This snapshot is available under the
Open Software License 2.0 (OSL-2.0), and you can download it from here:
http://www.kuckuk.com/public/downloads/tadb/tadb20031218.zip
I picked this license because it seems to prevent the most extreme forms of abuse
of my work. If you prefer a different license, feel free to contact me.
Future Work
I continue to enter new abbreviations into my database, and from time to time I will
make snapshots available for download here. I also intend to create an XML-RPC and a SOAP
front end.
Document History
First Version: December 31, 2003
Questions?
If you have any questions, please send e-mail to
Carsten Kuckuk
at
.