\chapter{HTML and CSS}\label{s:htmlcss}
HTML is the standard way to represent documents for presentation in web browsers,
and CSS is the standard way to describe how those documents should look.
Both are more complicated than they should have been,
but in order to create web applications,
we need to understand a little of both.
\section{Formatting}\label{s:htmlcss-formatting}
An HTML \gref{g:document}{document} contains \gref{g:element}{elements} and text
(and possibly other things that we will ignore for now).
Elements are shown using \gref{g:tag}{tags}:
an opening tag \texttt{\htmltag{tagname}} shows where the element begins,
and a corresponding closing tag \texttt{\htmltag{/tagname}} (with a leading slash) shows where it ends.
If there's nothing between the two, we can write \texttt{\htmltag{tagname/}} (with a trailing slash).
A document's elements must form a \gref{g:tree}{tree} (\figref{f:htmlcss-tree}),
i.e.,
they must be strictly nested.
This means that if Y starts inside X,
Y must end before X ends,
so \texttt{\htmltag{X}{\ldots}\htmltag{Y}{\ldots}\htmltag{/Y}\htmltag{/X}} is legal,
but \texttt{\htmltag{X}{\ldots}\htmltag{Y}{\ldots}\htmltag{/X}\htmltag{/Y}} is not.
Finally,
every document should have a single \gref{g:root-element}{root element}\index{element!root} that encloses everything else,
although browsers aren't strict about enforcing this.
In fact,
most browsers are pretty relaxed about enforcing any kind of rules at all,
since most people don't obey them anyway.
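As a concrete sketch of the nesting rule,
here is one legal and one illegal arrangement of
the \texttt{p} (paragraph) and \texttt{em} (emphasis) elements described later in this chapter;
the sentences themselves are invented for illustration:
\begin{minted}{html}
<!-- legal: em starts inside p and ends before p ends -->
<p>This text is <em>strictly nested</em>.</p>
<!-- not legal: p ends while the em inside it is still open -->
<p>This text is <em>not strictly nested.</p></em>
\end{minted}
\noindent
Most browsers will still display something for the second paragraph,
but we should not rely on how they repair the structure.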
\section{Text}\label{s:htmlcss-text}
The text in an HTML page is normal printable text.
However,
since \texttt{{\textless}} and \texttt{{\textgreater}} are used to show where tags start and end,
we must use \grefdex{g:escape-sequence}{escape sequences}{escape sequence} to represent them,
just as we use \texttt{\textbackslash{}"} to represent a literal double-quote character
inside a double-quoted string in JavaScript.
In HTML,
escape sequences are written \texttt{\&name;},
i.e.,
an ampersand, the name of the character, and a semi-colon.
A few common escape sequences are shown in \tblref{t:htmlcss-escapes}.
\begin{longtable}{lll}
Name & Escape Sequence & Character \\
Less than & \texttt{\&lt;} & {\textless} \\
Greater than & \texttt{\&gt;} & {\textgreater} \\
Ampersand & \texttt{\&amp;} & \& \\
Copyright & \texttt{\&copy;} & © \\
Plus/minus & \texttt{\&plusmn;} & ± \\
Micro & \texttt{\&micro;} & µ \\
\caplbl{HTML Escapes}{t:htmlcss-escapes}
\end{longtable}
The first two are self-explanatory,
and \texttt{\&amp;} is needed so that we can write a literal ampersand
(just as \texttt{\textbackslash{}\textbackslash{}} is needed in JavaScript strings so that we can write a literal backslash).
\texttt{\&copy;}, \texttt{\&plusmn;}, and \texttt{\&micro;} are usually not needed any longer,
since most editors will allow us to put non-ASCII characters directly into documents these days,
but occasionally we will run into older or stricter systems.
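As a small illustration (the sentence is invented for this example,
and the \texttt{p} element that wraps it is described in the next section),
a paragraph that needs to show comparison operators and ampersands
could be written like this:
\begin{minted}{html}
<p>
  If x &lt; y &amp;&amp; y &lt; z, then z &gt; x.
</p>
\end{minted}
\noindent
A browser should display this as
``If x {\textless} y \&\& y {\textless} z, then z {\textgreater} x.''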
\section{Pages}\label{s:htmlcss-pages}
An HTML page should have:
\begin{itemize}
\item
a single \texttt{html} element that encloses everything else
\item
a single \texttt{head} element that contains information about the page\index{head (of HTML page)}
\item
a single \texttt{body} element that contains the content to be displayed.\index{body!of HTML page}
\end{itemize}
It doesn't matter whether or how we indent the tags showing these elements and the content they contain,
but laying them out on separate lines
and indenting to show nesting
helps human readers.
Well-written pages also use comments, just like code:
these start with \texttt{{\textless}!-\/-} and end with \texttt{-\/-{\textgreater}}.
Unfortunately,
comments cannot be nested,
i.e.,
if you comment out a section of a page that already contains a comment,
the results are unpredictable.
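The sketch below shows one way this can go wrong;
the paragraphs are invented for illustration.
Browsers typically end a comment at the first \texttt{-\/-{\textgreater}} they find,
so the outer comment stops earlier than its author intended:
\begin{minted}{html}
<!-- outer comment starts here
  <p>This paragraph is hidden as intended.</p>
  <!-- inner comment: the outer comment usually ends on this line -->
  <p>This paragraph is probably displayed after all.</p>
-->
\end{minted}
\noindent
In that case the final \texttt{-\/-{\textgreater}} is left over
and is likely to show up in the page as ordinary text.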
Here's an empty HTML page with the structure described above:
\begin{minted}{html}
<html>
  <head>
    <!-- description of page goes here -->
  </head>
  <body>
    <!-- content of page goes here -->
  </body>
</html>
\end{minted}
\noindent
Nothing shows up if we open this in a browser,
so let's add a little content:
\begin{minted}{html}
<html>
  <head>
    <title>This text is displayed in the browser bar</title>
  </head>
  <body>
    <h1>Displayed Content Starts Here</h1>
    <p>
      This course introduces core features of <em>JavaScript</em>
      and shows where and how to use them.
    </p>
    <!-- The word "JavaScript" is in italics (emphasis) in the preceding paragraph. -->
  </body>
</html>
\end{minted}
\figpdf{figures/htmlcss-tree.pdf}{HTML as a Tree}{f:htmlcss-tree}
\begin{itemize}
\item
The \texttt{title} element inside \texttt{head} gives the page a title.
This is displayed in the browser bar when the page is open,
but is \emph{not} displayed as part of the page itself.
\item
The \texttt{h1} element is a level-1 heading;\index{heading (in HTML)}
we can use \texttt{h2}, \texttt{h3}, and so on to create sub-headings.
\item
The \texttt{p} element is a paragraph.
\item
Inside a heading or a paragraph,
we can use \texttt{em} to \emph{emphasize} text.
We can also use \texttt{strong} to make text \textbf{stronger}.
Tags like these are better than tags like \texttt{i} (for italics) or \texttt{b} (for bold)
because they signal intention rather than forcing a particular layout.
Someone who is visually impaired, or someone using a small-screen device,
may want emphasis of various kinds displayed in different ways.
\end{itemize}
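A short sketch that combines these elements might look like the following;
the headings and text are made up for illustration:
\begin{minted}{html}
<h1>Survey Results</h1>
<h2>Methods</h2>
<p>
  We contacted <em>every</em> participant by email.
</p>
<h2>Findings</h2>
<p>
  These figures are <strong>preliminary</strong>.
</p>
\end{minted}
\noindent
The two \texttt{h2} headings are sub-headings of the single \texttt{h1},
which is why they come after it rather than before it.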
\section{Attributes}\label{s:htmlcss-attributes}
Elements can be customized by giving them \grefdex{g:attribute}{attributes}{attribute},
which are written as \texttt{name="value"} pairs inside the element's opening tag.
For example:
\begin{minted}{html}
<h1 align="center">A Centered Heading</h1>
\end{minted}
\noindent
centers the \texttt{h1} heading on the page, while:
\begin{minted}{html}
<p class="disclaimer">This planet provided as-is.</p>
\end{minted}
\noindent
marks this paragraph as a disclaimer.
That doesn't mean anything special to HTML,
but as we'll see later,
we can define styles based on the \texttt{class} attributes of elements.
An attribute's name may appear at most once in any element,
just as a key can appear only once in any JavaScript object,
so \texttt{{\textless}p\ align="left"\ align="right"{\textgreater}{\ldots}\htmltag{/p}} is illegal.
If we want to give an attribute multiple values---for example,
if we want an element to have several classes---we put all the values in one string.
Unfortunately,
as the example below shows,
HTML is inconsistent about whether values should be separated by spaces or semi-colons:
\begin{minted}{html}
<p class="disclaimer optional" style="color: blue; font-size: 200%;">
\end{minted}
However they are separated,
values are supposed to be quoted,
but in practice we can often get away with \texttt{name=value}.
And for Boolean attributes whose values are just true or false,
we can even sometimes just get away with \texttt{name} on its own.
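The sketch below shows both shortcuts.
It assumes the global \texttt{hidden} attribute,
which is not otherwise discussed in this chapter
and which browsers accept when written without a value:
\begin{minted}{html}
<!-- quotes left off a value that contains no spaces -->
<p class=disclaimer>This planet provided as-is.</p>
<!-- attribute name on its own: this paragraph should not be displayed -->
<p hidden>This text is switched off.</p>
\end{minted}
\noindent
Leaving off the quotes only works when the value contains no spaces,
so \texttt{class=disclaimer optional} would not give the paragraph two classes.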
\section{Lists}\label{s:htmlcss-lists}
Headings and paragraphs are all very well,
but data scientists need more.
To create an unordered (bulleted) list,\index{list (in HTML page)}
we use a \texttt{ul} element,
and wrap each item inside the list in \texttt{li}.
To create an ordered (numbered) list,
we use \texttt{ol} instead of \texttt{ul},
but still use \texttt{li} for the list items.
\begin{minted}{html}
<ul>
  <li>first</li>
  <li>second</li>
  <li>third</li>
</ul>
\end{minted}
\begin{itemize}
\item
first
\item
second
\item
third
\end{itemize}
\begin{minted}{html}
<ol>
  <li>first</li>
  <li>second</li>
  <li>third</li>
</ol>
\end{minted}
\begin{enumerate}
\item
first
\item
second
\item
third
\end{enumerate}
Lists can be nested by putting the inner list's \texttt{ul} or \texttt{ol}
inside one of the outer list's \texttt{li} elements:
\begin{minted}{html}
<ol>
  <li>Major A
    <ol>
      <li>minor p</li>
      <li>minor q</li>
    </ol>
  </li>
  <li>Major B
    <ol>
      <li>minor r</li>
      <li>minor s</li>
    </ol>
  </li>
</ol>
\end{minted}
\begin{enumerate}
\item
Major A
\begin{enumerate}
\item
minor p
\item
minor q
\end{enumerate}
\item
Major B
\begin{enumerate}
\item
minor r
\item
minor s
\end{enumerate}
\end{enumerate}
\section{Tables}\label{s:htmlcss-tables}
Lists are a great way to get started,
but if we \emph{really} want to impress people with our data science skills,
we need tables.
Unsurprisingly,
we use the \texttt{table} element to create these.\index{table (in HTML page)}
Each row is a \texttt{tr} (for ``table row''),
and within rows,
column items are shown with \texttt{td} (for ``table data'')
or \texttt{th} (for ``table heading'').
\begin{minted}{html}
<table>
  <tr> <th>Alkali</th>   <th>Noble Gas</th> </tr>
  <tr> <td>Hydrogen</td> <td>Helium</td>    </tr>
  <tr> <td>Lithium</td>  <td>Neon</td>      </tr>
  <tr> <td>Sodium</td>   <td>Argon</td>     </tr>
</table>
\end{minted}
\begin{longtable}{ll}
Alkali & Noble Gas \\
Hydrogen & Helium \\
Lithium & Neon \\
Sodium & Argon \\
\end{longtable}
\noindent
Do \emph{not} use tables to create multi-column layouts:
there's a better way.
\section{Links}\label{s:htmlcss-links}
Links to other pages are what make HTML hypertext (\figref{f:htmlcss-links}).\index{link!in HTML page}
Confusingly,
the element used to show a link is called \texttt{a}.
The text inside the element is displayed and (usually) highlighted for clicking.
Its \texttt{href} attribute specifies what the link is pointing at;
both local filenames and URLs are supported.
Oh,
and we can use \texttt{\htmltag{br/}} to force a line break in text
(with a trailing slash inside the tag, since the \texttt{br} element doesn't contain any content):
\begin{minted}{html}
<a href="https://nodejs.org/">Node.js</a>
<br/>
<a href="https://facebook.github.io/react/">React</a>
<br/>
<a href="../index.html">home page (relative path)</a>
\end{minted}
\noindent
This appears as:
\begin{minted}{text}
Node.js
React
home page (relative path)
\end{minted}
\noindent
with the usual clickability.
\figpdf{figures/htmlcss-links.pdf}{Pages and Links}{f:htmlcss-links}
\section{Images}\label{s:htmlcss-images}
Images can be stored inside HTML pages in two ways:\index{image (in HTML page)}
by using SVG (which we will discuss in \chapref{s:vis})
or by encoding the image as text and including that text in the body of the page,
which is clever,
but makes the source of the pages very hard to read.
It is far more common to store each image in a separate file
and refer to that file using an \texttt{img} element
(which also allows us to use the image in many places without copying it).
The \texttt{src} attribute of the \texttt{img} tag specifies where to find the file;
as with the \texttt{href} attribute of an \texttt{a} element,
this can be either a URL or a local path.
Every \texttt{img} should also include a \texttt{title} attribute (whose purpose is self-explanatory)
and an \texttt{alt} attribute with some descriptive text to aid accessibility and search engines.\index{accessibility (in HTML page)}
(Again, we have wrapped and broken lines so that they will display nicely in the printed version.)
\begin{minted}{html}
<img src="./assets/logo.png" title="Book Logo"
     alt="Displays the book logo using a local path" />
<img src="https://js4ds.org/assets/logo.png"
     title="Book Logo"
     alt="Displays the book logo using a URL" />
\end{minted}
Two things to note here are:
\begin{enumerate}
\item
Since \texttt{img} elements don't contain any text,
they are often written with the trailing-slash notation.
However,
they are also often written improperly as \texttt{{\textless}img\ src="..."{\textgreater}} without any slashes at all.
Browsers will understand this,
but some software packages will complain.
\item
If an image file is referred to using a path rather than a URL,
that path can be either \grefdex{g:relative-path}{relative}{relative path}\index{path!relative}
or \grefdex{g:absolute-path}{absolute}{absolute path}\index{path!absolute}.
If it's a relative path,
it's interpreted starting from where the web page is located;
if it's an absolute path,
it's interpreted relative to wherever the web browser thinks
the \gref{g:root-directory}{root directory} of the filesystem is.
As we will see in \chapref{s:server},
this can change from one installation to the next,
so you should always try to use relative paths,
except where you can't.
It's all very confusing{\ldots}
\end{enumerate}
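To make the difference between the two kinds of path concrete,
here is a sketch with three ways of referring to an image by path.
The directory layout is hypothetical:
\begin{minted}{html}
<!-- relative path: starts from the directory containing this page -->
<img src="assets/logo.png" title="Book Logo"
     alt="Logo found next to the page" />
<!-- relative path: goes up one directory before looking for assets -->
<img src="../assets/logo.png" title="Book Logo"
     alt="Logo found one level above the page" />
<!-- absolute path: the leading slash means "start from the root directory" -->
<img src="/assets/logo.png" title="Book Logo"
     alt="Logo found from the root directory" />
\end{minted}
\noindent
The first two keep working if the page and its assets are moved together;
the third depends on what the server or browser treats as the root.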
\section{Cascading Style Sheets}\label{s:htmlcss-css}
When HTML first appeared, people styled elements by setting their attributes:
\begin{minted}{html}
<html>
  <body>
    <h1 align="center">Heading is Centered</h1>
    <p>
      <b>Text</b> can be highlighted
      or <font color="coral">colorized</font>.
    </p>
  </body>
</html>
\end{minted}
Many still do,
but a better way is to use \gref{g:css}{Cascading Style Sheets} (CSS).\index{CSS}
These allow us to define a style once and use it many times,
which makes it much easier to maintain consistency.
(We were going to say ``{\ldots}and keep pages readable'',
but given how complex CSS can be,
that's not a claim we feel we can make.)
Here's a page that uses CSS instead of direct styling:
\begin{minted}{html}
<html>
  <head>
    <link rel="stylesheet" href="simple-style.css" />
  </head>
  <body>
    <h1 class="title">Heading is Centered</h1>
    <p>
      <span class="keyword">Text</span> can be highlighted
      or <span class="highlight">colorized</span>.
    </p>
  </body>
</html>
\end{minted}
The \texttt{head} contains a link to an \gref{g:external-style-sheet}{external style sheet}\index{style sheet!external}
stored in the same directory as the page itself;
we could use a URL here instead of a relative path,
but the \texttt{link} element \emph{must} have the \texttt{rel="stylesheet"} attribute.
Inside the page,
we then set the \texttt{class} attribute of each element we want to style.\index{class!in HTML}
The file \texttt{simple-style.css} looks like this:
\begin{minted}{css}
h1.title {
  text-align: center;
}
span.keyword {
  font-weight: bold;
}
.highlight {
  color: coral;
}
\end{minted}
\noindent
Each entry has the form \texttt{tag.class} followed by a group of properties inside curly braces,
and each property is a key-value pair.
We can omit the class and just write (for example):
\begin{minted}{css}
p {
  font-style: italic;
}
\end{minted}
\noindent
in which case the style applies to everything with that tag.
If we do this,
we can override general rules with specific ones:
the style for a disclaimer paragraph is defined by \texttt{p} with overrides defined by \texttt{p.disclaimer}.
We can also omit the tag and simply use \texttt{.class},
in which case every element with that class has that style.
As suggested by the earlier discussion of separators,
elements may have multiple values for class,
as in \texttt{{\textless}span\ class="keyword\ highlight"{\textgreater}{\ldots}\htmltag{/span}}.
(The \texttt{span} element simply marks a region of text,
but has no effect unless it's styled.)
These features are one
(but unfortunately not the only)
common source of confusion with CSS:
if we can override general rules with specific ones
but also provide multiple values for class,
how do we keep track of which rules will apply to an element with multiple classes?
A detailed discussion of the order of precedence for CSS rules
is outside the scope of this tutorial. We recommend that those
likely to work often with stylesheets read (and consider bookmarking)
\hreffoot{https://www.w3schools.com/css/css\_specificity.asp}{this W3Schools page}.
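As a minimal sketch of a general rule being overridden by a specific one,
the page below defines a style for every \texttt{p}
and a second style for \texttt{p.disclaimer}.
The rules are placed in a \texttt{style} element in the page's \texttt{head}
(an internal style sheet, introduced in the discussion of Bootstrap below)
only to keep the example in one file,
and the colors are chosen arbitrarily:
\begin{minted}{html}
<html>
  <head>
    <style>
      /* general rule: applies to every paragraph */
      p {
        font-style: italic;
        color: black;
      }
      /* specific rule: applies only to paragraphs with class "disclaimer" */
      p.disclaimer {
        color: gray;
      }
    </style>
  </head>
  <body>
    <p>Italic and black: only the general rule matches.</p>
    <p class="disclaimer">Italic and gray: the specific rule overrides the color.</p>
  </body>
</html>
\end{minted}
\noindent
The disclaimer is still italic because the specific rule only replaces
the properties it mentions and leaves the rest of the general rule in place.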
One other thing CSS can do is match specific elements.
We can label particular elements uniquely within a page using the \texttt{id} attribute,\index{element!ID}
then refer to those elements using \texttt{\#name} as a \gref{g:selector}{selector}.
For example,
if we create a page that gives two spans unique IDs:
\begin{minted}{html}
<html>
  <head>
    <link rel="stylesheet" href="selector-style.css" />
  </head>
  <body>
    <p>
      First <span id="major">keyword</span>.
    </p>
    <p>
      Full <span id="minor">explanation</span>.
    </p>
  </body>
</html>
\end{minted}
\noindent
then we can style those spans like this:
\begin{minted}{css}
#major {
  text-decoration: underline red;
}
#minor {
  text-decoration: overline blue;
}
\end{minted}
\begin{aside}{Internal Links}
We can link to an element in a page using \texttt{\#name}\index{link!internal}
inside the link's \texttt{href}:
for example,
\texttt{{\textless}a\ href="page.html\#place"{\textgreater}text\htmltag{/a}}
refers to the \texttt{\#place} element in \texttt{page.html}.
This is particularly useful \emph{within} pages:
\texttt{{\textless}a\ href="\#place"{\textgreater}jump\htmltag{/a}}
takes us straight to the \texttt{\#place} element within this page.
Internal links like this are often used for cross-referencing and to create a table of contents.
\end{aside}
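A small table of contents built from internal links might be sketched like this;
the section names and IDs are hypothetical:
\begin{minted}{html}
<ul>
  <li><a href="#methods">Methods</a></li>
  <li><a href="#results">Results</a></li>
</ul>
<h2 id="methods">Methods</h2>
<p>How the data was collected.</p>
<h2 id="results">Results</h2>
<p>What the data showed.</p>
\end{minted}
\noindent
Each \texttt{href} uses a leading \texttt{\#},
but the matching \texttt{id} attribute does not.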
\section{Bootstrap}\label{s:htmlcss-bootstrap}
CSS can become very complicated very quickly,
so most people use a framework to take care of the details.
One of the most popular is \hreffoot{https://getbootstrap.com/}{Bootstrap}\index{Bootstrap}
(which is what we're using to style this website).
Here's the entire source of a page that uses Bootstrap
to create a two-column layout with a banner at the top (\figref{f:htmlcss-bootstrap}):
\begin{minted}{html}
<html>
  <head>
    <link rel="stylesheet"
          href="https://stackpath.bootstrapcdn.com/bootstrap/\
4.1.3/css/bootstrap.min.css">
    <style>
      div {
        border: solid 1px;
      }
    </style>
  </head>
  <body>
    <div class="jumbotron text-center">
      <h1>Page Title</h1>
      <p>Resize this page to see the layout adjust dynamically.</p>
    </div>
    <div class="container">
      <div class="row">
        <div class="col-sm-4">
          <h2>First column is 4 wide</h2>
          <p>Text here goes</p>
          <p>in the column</p>
        </div>
        <div class="col-sm-8">
          <h2>Second column is 8 wide</h2>
          <p>Text over here goes</p>
          <p>in the other column</p>
        </div>
      </div>
    </div>
  </body>
</html>
\end{minted}
\figimgscale{figures/htmlcss-bootstrap.png}{Bootstrap Layout}{f:htmlcss-bootstrap}{0.5}
The page opens by loading Bootstrap from the web;
we can also download \texttt{bootstrap.min.css} and refer to it with a local path.
(The \texttt{.min} in the file's name signals that the file has been \grefdex{g:minimization}{minimized}{minimization}
so that it will load more quickly.)
The page then uses a \texttt{style} element to create
an \gref{g:internal-style-sheet}{internal style sheet}\index{style sheet!internal}
to put a solid one-pixel border around every \texttt{div}
so that we can see the regions of the page more clearly.
Defining styles in the page header is generally a bad idea,
but it's a good way to test things quickly.
Oh,
and a \texttt{div} just marks a region of a page without doing anything to it,
just as a \texttt{span} marks a region of text without changing its appearance
unless we apply a style.
The first \texttt{div} creates a header box (called a ``jumbotron'') and centers its text.
The second \texttt{div} is a container,
which creates a bit of margin on the left and right sides of our content.
Inside that container is a row with two columns,
one 4/12 as wide as the row and the other 8/12 as wide.
(Bootstrap uses a 12-column system because 12 has lots of divisors.)
Bootstrap is \grefdex{g:responsive-design}{responsive}{responsive design}:
elements change size as the page grows narrower,
and are then stacked when the screen becomes too small to display them side by side.
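Because the widths in a row are counted in twelfths,
other splits are possible as long as they add up to 12.
As a sketch, assuming the same Bootstrap style sheet has been loaded
in the page's \texttt{head} as in the example above,
three equal columns could be written as:
\begin{minted}{html}
<div class="container">
  <div class="row">
    <!-- 4 + 4 + 4 = 12, so each column takes a third of the row -->
    <div class="col-sm-4">left</div>
    <div class="col-sm-4">middle</div>
    <div class="col-sm-4">right</div>
  </div>
</div>
\end{minted}
\noindent
On a narrow screen these three columns should stack vertically
in the same way as the two columns in the larger example.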
We've left out many other aspects of HTML and CSS as well,
such as figure captions,
multi-column table cells,
and why it's so hard to center text vertically within a \texttt{div}.
One thing we will return to in \chapref{s:interactive} is
how to include interactive elements like buttons and forms in a page.
Handling those is part of why JavaScript was invented in the first place,
but we need more experience before tackling them.
\section{Exercises}\label{s:htmlcss-exercises}
\exercise{Cutting Corners}
What does your browser display if you forget to close a paragraph or list item tag
like this:
\begin{minted}{html}
<p>This paragraph starts but doesn't officially end.
<p>Another paragraph starts here but also doesn't end.
<ul>
  <li>First item in the list isn't closed.
  <li>Neither is the second.
</ul>
\end{minted}
\begin{enumerate}
\item
What happens if you don't close a \texttt{ul} or \texttt{ol} list?
\item
Is that behavior consistent with what happens when you omit \texttt{\htmltag{/p}} or \texttt{\htmltag{/li}}?
\end{enumerate}
\exercise{Mix and Match}
\begin{enumerate}
\item
Create a page that contains a 2x2 table,
each cell of which has a three-item bullet-point list.
How can you reduce the indentation of the list items within their cells using CSS?
\item
Open your page in a different browser (e.g., Firefox or Edge).
Do the two browsers display your indented lists consistently?
\item
Why do programs behave inconsistently?
Why do programmers do this to us?
Why?
Why why why why why?
\end{enumerate}
\exercise{Naming}
What does the \texttt{sm} in Bootstrap's \texttt{col-sm-4} and \texttt{col-sm-8} stand for?
What other options could you use instead?
Why do web developers still use FORTRAN-style names in the 21st Century?
\exercise{Color}
HTML and CSS define names for a small number of colors.
All other colors must be specified using \gref{g:rgb}{RGB} values.
Write a small JavaScript program that creates an HTML page
that displays the word \texttt{color} in 100 different randomly-generated colors.
Compare this to the color scheme used in your departmental website.
Which one hurts your eyes less?
\exercise{Units}
What different units can you use to specify text size in CSS?
What do they mean?
What does \emph{anything} mean, when you get right down to it?
\section*{Key Points}
\input{keypoints/htmlcss}