Notes on how to set up and maintain Web pages


Where to put your Web pages

On your Linux/UNIX account on a host such as student.computing.dcu.ie, in a directory called public_html:

cd
mkdir public_html
edit public_html/index.html
edit public_html/file.html

These files then appear on the Web as:

http://host/~user/
http://host/~user/file.html

The URL will be something like:

http://student.computing.dcu.ie/~username/
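
The mapping from a file under public_html to its URL can be sketched as a small shell function. This is only an illustration: the name page_url and its argument order are made up for these notes, and it assumes the server returns index.html for the bare directory URL, as the listing above suggests.

```shell
# Hypothetical helper (not a standard command): print the public URL
# for a file that lives under ~/public_html.
# Usage: page_url HOST USER FILE
page_url() {
    host=$1
    user=$2
    rel=${3#public_html/}    # drop a leading "public_html/" if present
    # Assumption: index.html is served for the bare directory URL
    case $rel in
        index.html) rel= ;;
    esac
    printf 'http://%s/~%s/%s\n' "$host" "$user" "$rel"
}
```

For example, page_url student.computing.dcu.ie username public_html/file.html prints the address that a browser would use for file.html.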


Web hosting in DCU:



Protections

Every directory in the hierarchy above the files needs to be executable by "others". See Notes on directory protections.

cd
chmod o+x .
chmod o+x public_html

All files need to be readable by "others". See Notes on file protections.

cd
chmod o+r public_html/index.html
chmod o+r public_html/file.html
chmod o+r public_html/image.jpg
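
With many files, running chmod on each one by hand gets tedious. The following is a minimal sketch that applies the same two rules to a whole directory tree. The function name publish_perms is an assumption made for these notes, not a standard command, and it deliberately touches only the directory you give it.

```shell
# Hypothetical helper: make a web directory tree readable by the web server.
# Usage: publish_perms DIR    (DIR would normally be ~/public_html)
publish_perms() {
    dir=$1
    # Directories need execute (search) permission for "others"
    find "$dir" -type d -exec chmod o+x {} +
    # Regular files need read permission for "others"
    find "$dir" -type f -exec chmod o+r {} +
}

# Typical use, mirroring the commands above:
#   chmod o+x "$HOME"
#   publish_perms "$HOME/public_html"
```

Note that this makes every file under the directory world-readable, so it is only appropriate if everything in there is meant to be public.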





How to write them

  1. Raw HTML in a text editor

  2. Raw HTML in an assisted text editor

  3. WYSIWYG




Minimalist web page

<!DOCTYPE html>
<html>
<head>
<title> My web page </title>
</head>
<body>

<h1> My web page </h1>

<p> I am a very interesting person
and here are my poems. </p>

<p> Here is a link to my favourite artist,
<a href="http://www.daniel-site.com/"> Daniel O'Donnell</a>.
I hope to marry him some day. </p>

</body>
</html>


Read HTML Reference.

Use "View Source" on other people's pages around the Web to scavenge them for ideas (but be careful not to copy their actual content, whether text or images).



Some HTML tags.



Using other formats

(or converting other formats to HTML)




HTML plus images

HTML plus images is the most portable format, readable everywhere and on anything. Think of your users not just in the CA labs, but also at home, at work, abroad, on old machines and slow phone lines, on Web TV and palmtops. Why make them unable to read your work for no good reason? Use the lowest common denominator.

pdf, doc, ps, rtf, and anything else that requires a plugin, often break the clean Web model of browse-and-move-on. Instead we get a dialog asking us to save to disk (Where? And will litter be left behind?) and run the plugin to view.

Of course, this is all about integration of the plugin with the browser - you can set it up so that the plugin launches automatically and the file is deleted afterward. But of course you have to go and get the plugin. And wait 2 hours while it downloads. And install it. And reboot. And you've got other things to do. And are you really bothered about reading this document anyway? There's lots of other stuff to read on the Net. So you hit the "Back" button, and move on.

With the browsers many people currently use, pdf, doc, ps and rtf simply mean "not browsable", "offline".



HTML creates a seamless Web

Also, if it is in HTML, the content can be picked up by search engines (whereas the content of ps, doc, pdf, etc. may be hidden).


But perhaps the most important reason to present everything in HTML is that people can link to it, can link to sub-sections within it, can link to labels within those sub-sections, and those sections in turn can link back out to everything else on the Web.



HTML is safe

Another reason not to use Microsoft Word documents is the risk of spreading macro viruses. The other formats are much less exposed to this risk, and static HTML, which has no macros, is not exposed to it at all.

Even if you are confident your Word files are uninfected, think of your users. As I'm browsing data, unless something is absolutely essential to my job, if I see that somebody's data is in Word format, I simply won't read it.


Robert X. Cringely points out the genius of Microsoft's public relations:

The wonder of all these Internet security problems is that they are continually labeled as "e-mail viruses" or "Internet worms," rather than the more correct designation of "Windows viruses" or "Microsoft Outlook viruses."



How to browse them


Relative links




How to upload them

  1. Edit them directly off disk in UNIX, or:

  2. Edit them in Windows, with your UNIX account set up to look like a drive.

See Accessing UNIX remotely and from Windows.




Search engines

See How to write a CGI script.

My search engine simply does a grep of all my web pages on the spot, and pipes the result through a filter that generates tidy HTML code (so you can click on the pages in the results).
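
The grep-and-filter idea can be sketched roughly as follows. This is not the actual script referred to here, only an illustration of the same approach: the function name search_pages, its arguments, and the choice to search only *.html files are all assumptions.

```shell
# Hypothetical sketch: list pages under DIR that mention QUERY, as HTML links.
# Usage: search_pages DIR BASE_URL QUERY    (DIR without a trailing slash)
search_pages() {
    dir=$1
    base=$2
    query=$3
    printf '<ul>\n'
    # grep -l: print only the names of matching files
    # -i: ignore case; -F: treat the query as a fixed string, not a regex
    find "$dir" -type f -name '*.html' -exec grep -l -i -F -e "$query" {} + |
    sort |
    while IFS= read -r file; do
        rel=${file#"$dir"/}    # path relative to DIR
        # The "filter" step: turn each file name into a clickable link
        printf '<li><a href="%s/%s">%s</a></li>\n' "$base" "$rel" "$rel"
    done
    printf '</ul>\n'
}
```

For example, search_pages "$HOME/public_html" "http://host/~user" poems would print a list of links to every page mentioning "poems". A real CGI version would also have to decode the query string from the browser and escape any special characters before printing HTML.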

How to write a search engine in 9 lines of Shell



CSS

With hundreds or more pages, you will want a common look and feel.


SSI

With hundreds or more pages, you will want to separate out code that is common to all pages.

SSI is invisible to the client: