File Format


File Format Information

Last update: 12/21/17 (see end of page)

Over the years I have done several reverse engineering projects on file formats that interested me. I am converting my old pages to Google Sites and will add links and information as time allows. Google Sites mangles my pages a bit when I upload them, so I also provide some comments on recovering the original pages if you are interested.

Early in 1990 I did some work on the *.TD0 Teledisk format created by Sydex.
This work, in conjunction with other people's work, allows PDP-11 floppy disk images to be manipulated to recreate functional physical DEC RX50 disks, or just to extract the original files. This page describes the Teledisk file format and methods for creating RX50 diskettes.

Between 2008 and 2010 I explored the Win9x backup file formats. Microsoft's early Windows operating systems, specifically Win98 through WinME, included Backup programs. Unfortunately, when WinXP was released Microsoft failed to provide a means of restoring files from these backups. My pages documenting these file formats, and the programs I wrote that can recover files from the archives, are currently hosted at jacobytech.net. The owner of this site, Ryan Jacob, created it during the year-long period my pages were off-line due to my ownership battle with the 1-and-1.com web hosting company. He recovered the pages using the Wayback Machine. Very enterprising! I will continue to host the *.lzh distribution archives, which include the source code for these programs, on my Downloads page, but I see no point in duplicating the documentation Ryan has provided. Development on this project stopped back in 2010.


In 2017 I did some work on the Iomega 1-Step Backup file format.
1-Step Project Overview


Google Sites HTML Page Information

I appreciate that Google Sites offers a free service for publishing my HTML pages. I don't have much valuable intellectual property, and am pleased to be able to publish what I have at no cost. However, I find using this service to post HTML pages a little tricky, and the available online help does not fully explain the system.

I started using what they call the 'classic' version in the late summer of 2016. After a significant amount of trial and error I seem to have a workable system now. The help files for the built-in HTML editor tell you that some standard HTML tags are not allowed. In particular the following, which most of my original pages contained, are not allowed:

<!--  comment  -->
<html>    </html>
<body>    </body>
<title>   </title>
If you try to paste in any of the above it will be ignored and you get a warning message.
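
One way to avoid the warning is to remove these constructs from a page before pasting it into the editor. The following is only a minimal Python sketch of that idea, not something Google Sites provides: the function name strip_disallowed is hypothetical, and it assumes the title text is not wanted in the page body, so the whole <title> element is dropped while the contents of <html> and <body> are kept.

```python
import re


def strip_disallowed(html: str) -> str:
    """Remove the constructs the editor rejects, keeping the page content."""
    # Drop HTML comments, including ones that span several lines.
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    # Drop the whole <title> element, text included (assumption: the
    # title text should not appear in the pasted page body).
    html = re.sub(r"<title\b[^>]*>.*?</title\s*>", "", html,
                  flags=re.DOTALL | re.IGNORECASE)
    # Drop only the <html> and <body> tags themselves; their contents stay.
    html = re.sub(r"</?(?:html|body)\b[^>]*>", "", html, flags=re.IGNORECASE)
    return html.strip()


if __name__ == "__main__":
    page = "<html><title>T</title><body><!-- note --><p>Hi</p></body></html>"
    print(strip_disallowed(page))
```

Regular expressions are adequate here because only a handful of fixed tags are involved; a full HTML parser would be the safer choice for pages with unusual markup.
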
What is not so clear is that files with a *.htm or *.html extension are not treated the same way as most other files. Most pages (apparently /home is an exception) allow you to add files from your local computer. Most files, i.e. *.jpg, *.lzh, *.tar, can then be accessed normally with a web browser. I expected to be able to add a *.htm file to a page and then open it in my browser of choice. This does not seem to be the case. Instead there is an extra step where my browser, Firefox in these trials, asks what I want to do with the file. It does offer the option of using the browser to open the file, but it first copies the file to my local drive and then opens it there. It appears the only way to get an HTML page to display normally in my browser is to use the Google Sites 'create' tab to make a new page on their site, select HTML editing, and then paste in the desired HTML code. Note that during page creation you specify the URL for the page, but this URL may not contain '*.htm' or '*.html' at the end. This is slightly cumbersome, but manageable.

However, the page creation operation adds between 30 kb and 230 kb of additional code to the content one uploads. I do not know why it varies, but it's often 230 kb even for a pretty simple file with 5 kb of source material. All of this additional code is in text format and is moderately readable, although a lot of it is various JavaScript functions. I have not had the interest or energy to trace through all of this additional source code. It is probably entirely benign, but as a moderately sensitive web user I wish Google Sites published information about what this code is doing. For instance, is it tracking your behavior or mine? I have no idea. For the time being I am just ignoring it and accepting it as the cost of using Google Sites services.

If you are part of the select group who download and save useful documentation for offline use, you may find the 230 kb overhead painful. In all the files I have created to date, my original HTML source code has been sandwiched between <tbody> and </tbody> tags near the end of the material downloaded from Google Sites. It's easy to use a text editor to extract the original page, although it is a slight nuisance. I am attempting to save my original documentation pages with the source code distributions of the programs discussed in this file format section of my pages so you can avoid this hassle.
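
The text editor step can also be scripted. The following is a rough Python sketch rather than a tested tool: the function name extract_original is hypothetical, and it assumes the original page is wrapped by the last <tbody> ... </tbody> pair in the download and contains no <tbody> tags of its own.

```python
import re


def extract_original(downloaded: str) -> str:
    """Return the text between the last <tbody> tag and its closing tag."""
    # Find every opening <tbody ...> tag; the wrapper is assumed to be the
    # last one, since the original page sits near the end of the download.
    opens = list(re.finditer(r"<tbody\b[^>]*>", downloaded, flags=re.IGNORECASE))
    if not opens:
        raise ValueError("no <tbody> tag found")
    start = opens[-1].end()
    # Search for the matching close tag, starting after the opening tag.
    close = re.compile(r"</tbody\s*>", re.IGNORECASE).search(downloaded, start)
    if close is None:
        raise ValueError("no closing </tbody> tag found")
    return downloaded[start:close.start()].strip()


if __name__ == "__main__":
    sample = ("<script>x()</script><table><tbody><h1>Doc</h1><p>text</p>"
              "</tbody></table><script>y()</script>")
    print(extract_original(sample))
```

If one of the original pages does contain its own table with a <tbody>, this sketch would return only part of the page, and the extraction is better done by hand.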

UPDATES

02/14/17  create google-sites page and add Iomega backup work
12/21/17  update with links to old MSbackup and Teledisk file formats
