PB138 - Modern Markup Languages and Their Applications

Week 03 - HTML5

Tomáš Pitner

Outline

  • Introduction: HTML5 as the foundation of WWW
  • Commonly used elements
  • WAI ARIA
  • Regular DOM, Virtual DOM, Shadow DOM & AOM
  • Browser rendering
  • Minification, Compression

Introduction to HTML5

  • Part of basic web standards, currently on version HTML 5.2

  • HTML5 is used on websites and mobile and desktop devices/apps

  • Previous popular versions: XHTML and HTML 4 still in use

  • XHTML was the only HTML being a strict XML markup

  • On the other hand, HTML is based on old Structured Generalized Markup Language (SGML)

  • HTML5 needs

    • CSS for styling
    • JavaScript (ECMAScript) for scripting - interactivity

What is HTML5

  • HyperText Markup Language (HTML) 5
  • HTML5 is NOT XML in strict sense (some well-formedness rules ommited)
  • HTML5 is similar to original (SGML-based) versions of HTML
  • Extends previous versions by many new elements and API
    • multimedia (<audio>, <video>, <picture>, <embed>...)
    • document type, structure, metadata (<article>, <section>, <header>, <main>)
    • navigation and controls (<nav>, <menu>, <progress>, <menuitem>)
    • semantic markup (<time>, <samp>, <kbd>)
    • support for data in the document (<data>, <datalist>)

Example

<!DOCTYPE html>
<html lang="en">
  <head>
      <title>A simple HTML document</title>
  </head>
  <body>
      <p>Hello World!<p>
  </body>
</html>
  • <!DOCTYPE html> is the document type declaration (always like this)
  • <head> - contains metadata of the document, such as <title>, or links to css styles
  • <body> - contains the content, i.e. text, headings, links to js files...

Basic syntax

  • File extension .html (preferred) or .htm
  • Element names (tags) are case insensitive (lower case is best practice)
  • Attribute values are usually case sensitive(!)
  • Types of elements:
    • Block elements = create block (rectangle), e.g. paragraph
    • Inline elements = change formatting of some objects (text) in line
    • Semantic elements = intended meaning or purpose (article, main...)
    • Nonsemantic elements = other elements without meaning (div, span...)

Elements

  • HTML5 is a concrete markup, so only defined elements may be used
  • Element has one of the meanings:
    • structures the text
    • places images, frames, video...
    • defines styles
    • defines scripts

One paragraph of text with one line break inside

<p>This paragraph contains <br> a line break.</p>
  • Note: <br> is an empty element
  • In contrast to XML, we do not need to write "well-formed" <br/>

Nesting

Elements must be nested within each other properly.

<!-- Correct way of nesting -->
<div><span>These tags are nested properly.</span></div>

<!-- Incorrect way of nesting -->
<span><div>These tags are not nested properly.</span></div>

Exact nesting rules (which element can appear in which context) are defined in HTML5 Specification.

Comments

  • Same as (general) XML comments
  • HTML comment begins with <!--, and ends with -->
<!-- 
  Example of multiline comment
-->

<!-- Single line comment -->

Common block elements

  • <div> - division (general block element, usually styled)

  • <p> - paragraph (similar to <div>)

  • <h1> through <h6> - headings

  • <form> - form for inputs to submit to server

  • <textarea> - text area (big text fields)

  • <ol>, <ul> - ordered and unordered lists

  • <li> - list item

  • Block elements must not appear within inline elements!

Common inline elements

  • <a> - insert HTML link or link anchor
  • <span> - general inline element, usually styled
  • <code> - preformatted (non-proportional) font used
  • <input>, <button> - text input fields, buttons
  • <img> - insert image (behaves like inline-block)

The following elements are mainly used for inline semantics not styling:

  • <strong> - emphasized (typically bold) text
  • <b> - bold text
  • <em> - emphasized (typically italics) text
  • <i>, <u> - text in italics, underlined

Attributes

Specify details of an element (eg. image source, target URL, title, id)

<img src="images/smiley.png" width="30" height="30" alt="Smiley">
<a href="https://www.google.com/" title="Search Engine">Google</a>
<abbr title="Hyper Text Markup Language">HTML</abbr>
<input type="text" value="John Doe">

General attributes

Some attributes can be used by most elements:

  • id - unique ID of the element within the document
  • class - CSS class for the element
  • style - in-place style specification for the element (eg. color, size)
  • title - for many elements - specifies title (eg. for accessibility)
  • lang - language code for the text content, mainly used on html tag

Commonly used elements

Headings

  • Define structure of document - i.e. start of chapter, section
  • Not just bigger font
  • Important to use for search engines, SEO
  • Produce headings of decreasing font size
<h1>Heading level 1</h1> - biggest
<h2>Heading level 2</h2>
<h3>Heading level 3</h3>
<h4>Heading level 4</h4>
<h5>Heading level 5</h5>
<h6>Heading level 6</h6> - smallest

Text paragraphs

  • paragraphs start with <p> and may (preferred) or may not end with </p> (when used as separator)
  • <br> means just a line break, such as Shift + Enter in Word (not used for positioning)
<p>This is a paragraph <br> with line break.</p>
<p>This is <br>another paragraph <br> with line breaks.</p>

Paragraphs are usually (but not always) styled by assigning to a class

<p class="my-nice-para">This is a styled paragraph.</p>

Horizontal rulers

Horizontal line between block elements, eg. paragraphs

<p>This is a paragraph.</p>
<hr>
<p>This is another paragraph.</p>

Preformatted text

<pre>
  Twinkle, twinkle, little star, 
  How I wonder what you are! 
  Up above the world so high, 
  Like a diamond in the sky.
</pre>

Address

<address>
Mozilla Foundation<br>
331 E. Evelyn Avenue<br>
Mountain View, CA 94041, USA
</address>

Whitespaces

  • you can use either space char (ASCII 32), tab char, newline... everything is a white space
  • if you use a fixed whitespace entity &nbsp; (non-breaking space)
<p>This paragraph has multiple&nbsp;&nbsp;&nbsp;spaces.</p>
  • then you get
    • exact number of fixed-width spaces
    • connecting the text before and after as non-space characters

Links

  • Allow to click and browse from page to another
<a href="page.html">link to Page</a>
  • Can point and go even inside of document to specific anchor/ID:
<a href="mypage.html#topicA">Go to TopicA</a>
  • To open a new window or tab:
<a target="_blank" href="http://website.com">Open in new tab</a>

Text formatting

The following example contains elements with a purely semantic purpose

<p>This is <b>bold text</b>.</p>
<p>This is <strong>strongly important text</strong>.</p>
<p>This is <i>italic text</i>.</p>
<p>This is <em>emphasized text</em>.</p>
<p>This is <mark>highlighted text</mark>.</p>
<p>This is <code>computer code</code>.</p>
<p>This is <small>smaller text</small>.</p>
<p>This is <sub>subscript</sub> and <sup>superscript</sup> text.</p>
<p>This is <del>deleted text</del>.</p>
<p>This is <ins>inserted text</ins>.</p>
  • <strong> and <b> are usually the same but may differ, <strong> is more general
  • <em> and <i> are usually the same but may differ, <em> is more general

Quotes

Block quote - block element usually indented and in a different font

<blockquote>
    <p>Learn from yesterday, live for today, hope for tomorrow. 
      The important thing is not to stop questioning.</p>
    <cite>— Albert Einstein</cite>
</blockquote>

Inline quote

<p>According to the World Health Organization (WHO): <q>Health is a state of complete physical, mental, and social well-being.</q></p>

Citations

  • <cite> usually appears inside <blockquote>
  • Reference to a real source, eg. person, movie, book
  • Usually appears as italics
  • Does not produce any link etc.
<p>My favorite movie is <cite>star wars</cite>.</p>
<p>My another favorite movie is <cite>harry potter</cite>.</p>

Abbreviations

  • <abbr> explains an abbreaviation
<p>The <abbr title="World Wide Web Consortium">W3C</abbr> is the main international standards organization for the
<abbr title="World Wide Web">WWW or W3</abbr>. It was was founded by Tim Berners-Lee.</p>

Images

Inline element so it can be used also for small images on line

<img src="kites.jpg" alt="Flying Kites">
<img src="sky.jpg" alt="Cloudy Sky">
<img src="balloons.jpg" alt="Balloons">
  • src points to URL with the image source (file)
  • alt provides text alternative (description) for accessibility or slow connection
  • Note that <img> is an empty element, no need to write <img src=... />
  • Can be styled (via class or style)

Picture

Advanced new HTML5 element for multiple versions (eg. resolutions) of an image

<picture>
    <source media="(min-width: 1000px)" srcset="logo-large.png">
    <source media="(max-width: 500px)" srcset="logo-small.png">
    <img src="logo-default.png" alt="My logo">
</picture>

There is also image map <map> element for creating clickable map

Tables

Used for showcasing the data, not for formatting or layouting! For layouting use <div>s and CSS styles

<table>
    <tr>
        <th>No.</th>
        <th>Name</th>
        <th>Age</th>
    </tr>
    <tr>
        <td>1</td>
        <td>Peter Parker</td>
        <td>16</td>
    </tr>
</table>
  • <table> - encloses entire table, contains:

    • <tr> - table rows which contain cells
    • <th> - table heading cell (formatted differently)
    • <td> - simple table cell
  • Details inside of <table> element

    • <thead>
    • <tbody>
    • <tfoot>

Cells spanning multiple columns

<table>
    <tr>
        <th>Name</th>
        <th colspan="2">Phone</th>
    </tr>
    <tr>
        <td>John Carter</td>
        <td>5550192</td>
        <td>5550152</td>
    </tr>
</table>

Cells spanning multiple rows

<table>
    <tr>
        <th>Name:</th>
        <td>John Carter</td>
    </tr>
    <tr>
        <th rowspan="2">Phone:</th>
        <td>55577854</td>
    </tr>
    <tr>
        <td>55577855</td>
    </tr>
</table>

Captions

  • <caption> - allows to specify table caption (eg. title)
  • can be positioned elsewhere by styles' caption-side property
  • must be the first child element of <table>
<table>
    <caption>Users Info</caption>
    <tr>
        <th>No.</th>
        <th>Name</th>
        <th>Age</th>
    </tr>
    ...

Unordered list

Used to create a list of related items, in no particular order

<ul>
    <li>Chocolate Cake</li>
    <li>Black Forest Cake</li>
    <li>Pineapple Cake</li>
</ul>

Ordered list

Used to create a list of related items, in a specific order

<ol>
    <li>Fasten your seatbelt</li>
    <li>Starts the car's engine</li>
    <li>Look around and go</li>
</ol>

Description list

Definition list — Used to create a list of terms and their descriptions.

<dl>
    <dt>Bread</dt>
    <dd>A baked food made of flour.</dd>
    <dt>Coffee</dt>
    <dd>A drink made from roasted coffee beans.</dd>
</dl>

Forms

Input elements

  • Direct input from user - entering text, selecting option
  • <input> but also others (<textarea>, <select> and <option>)
  • Uniquely identified in form by both name="username" and id="username"
<form>
  <input type="text" name="username" id="username">
  <input type="password" name="password" id="password">
  <input type="hidden" name="browser" value="chrome">
</form>

Label

  • <label> provides description to specific input field
<form>
  <label for="username">Username:</label>
  <input type="text" name="username" id="username">
</form>

Checkbox and Radio button set

  • Radio buttons are selected as mutually exclusive
  • Radio buttons are usually round symbols and their labels
<form>
  <label for="terms">Do you agree with terms?</label>
  <input type="checkbox" name="terms" id="terms">
  <input type="radio" name="gender" id="female">
  <label for="female">Female</label>
</form>

Checkboxes

  • Checkboxes can be independently selected
  • Checkboxes are usually square symbols with indication when selected
<form>
  <input type="checkbox" name="sports" id="soccer">
  <label for="soccer">Soccer</label>
  <input type="checkbox" name="sports" id="cricket">
  <label for="cricket">Cricket</label>
</form>

File select

  • used for file upload facility
  • appears as button to select file
<form>
  <label for="file-select">Upload:</label>
  <input type="file" name="upload" id="file-select">
</form>

Textarea

  • Same as text field but multiline (rectangle area of rows * cols)
<form>
  <label for="address">Address:</label>
  <textarea rows="3" cols="30" name="address" id="address"></textarea>
</form>

Select from list

  • Select from dropdown list
  • Returns the selected option value
<label for="city">City:</label>
<select name="city" id="city">
  <option value="sydney">Sydney</option>
  <option value="melbourne">Melbourne</option>
  <option value="cromwell">Cromwell</option>
</select>

Buttons

Either generic button <button> or specific reset and submit (default) buttons

<form action="action.php" method="post">
  <label for="first-name">First Name:</label>
  <input type="text" name="first-name" id="first-name">
  <input type="submit" value="Submit">
  <input type="reset" value="Reset">
</form>

Grouping inputs

  • logically (and visually) group related inputs
  • may contain general description - <legend>
<fieldset>
  <legend>Contact Details</legend>
  <label>Email Address: <input type="email" name="email"></label>
  <label>Phone Number: <input type="text" name="phone"></label>
</fieldset>

Form element

  • Encloses the entire form to be submitted
  • Usually contains at least submit button
  • The input data is sent via HTTP POST (typically) or GET (by default) methods
  • GET sends the data in URL while POST sends data in request body
  • Controlled by form attributes:
    • name of the form (mostlyfor scripting)
    • action - target for processing the form
    • method - either get or post
    • target - where to display the response (default: _self)
    • enctype - character encoding for POST data
<form action="action.php" method="post"></form>

WAI ARIA

Web Accessibility Initiative’s Accessible Rich Internet Applications

  • Describe roles, states and properties

  • Provides recognition and usability for users with assistive technology (Screen Reader, Keyboard navigation)

  • Structured access to the website

  • Types of Aria attributes:

    • Indication of widgets (role="alert", role="modal", aria-label="Choose action")
    • Indication of structure (role="main", role="article")
    • Indication of state (aria-expanded="false")
    • Indication of properties (aria-haspopup="true")
  • Also known as a11y (accessibility)

Example for image

<p id="dimg">
A long description of our paragliding trip ...
</p>

<div>
<img src="takeoff.png"
	 alt="Getting ready to take off"
	 aria-labelledby="limg"
	 aria-describedby="dimg">
</div>

Example with dropdown widget

<div class="dropdown">
  <button type="button" aria-haspopup="true" aria-expanded="false" id="dropdown-1">
    Choose action
  </button>
  <div aria-labelledby="dropdown-1">
    <a href="#">First action</a>
    <a href="#">Second action</a>
  </div>
</div>

HTML5 and Accessibility

<nav> vs <div role="navigation">

Not fully supported, thus recommended to use aria* attributes

Source

Regular DOM, Virtual DOM, Shadow DOM & AOM

Document Object Models

Regular DOM

  • Browsers use internal representation of DOM model
  • Elements are nodes -> Programming languages interact with it
  • HTML uses extended DOM called HTML DOM, that provides additional API
    • Access to and control of HTML elements via the DOM
    • Access to and manipulation of form data
    • Supporting and connective interfaces for other APIs
    • ...many more
  • Bound to displayed content on the website
const paragraphs = document.querySelectorAll("p");
// paragraphs[0] is the first <p> element
// paragraphs[1] is the second <p> element, etc.
alert(paragraphs[0].nodeName);

Virtual DOM

  • Abstraction over the HTML DOM
  • Unlike Regular DOM, Virtual is not connected to the HTML DOM
  • Updates only relevant nodes (Reconciliation) -> optimised and accelared process
    • Minimises update operations
    • Diffing algorithm (diffs HTML DOM tree and Virtual DOM tree)
    • Reduces O(n3) tree compare down to O(n)
  • Used by frameworks Vue.js, React.js...
const root = React.createElement('div');
ReactDOM.render(root, document.getElementById('example'))

Shadow DOM

  • Used for encapsulated web components

  • Keeps behaviour, markup and style separate from other code

  • Consists of:

    • Shadow host - The regular DOM node that shadow DOM is attached to
    • Shadow tree - The dom inside of shadow DOM
    • Shadow boundary - Boundary between Shadow tree and Regular Tree
    • Shadow root - Root node of shadow tree
  • Example elements having shadow DOM: video, web components

/* Attach shadow dom to element */
let shadow = elementRef.attachShadow({mode: 'open'});
/* Interact with shadow dom */
let para = document.createElement('p');
shadow.appendChild(para);

AOM DOM

  • Currently just a draft from 23th June of 2021 (backed by Google, Apple and Mozilla)
  • Aims to improve accesibility for assistive technologies

Browser rendering

  • Browser fetches the HTML file
  • HTML is parsed to HTML DOM tree
  • CSS is parsed into CSSOM tree
  • CSSOM tree and HTML DOM tree are combined to render tree
  • Layout computes exact positions and sizes of elements
  • Browser creates the layers of elements
    • During rasterization missing parts such as background color, shadows are filled in
  • Layers are painted by the browser on the screen

This is called Critical rendering path

Minification, Compression

  • Techniques used to optimise fetching process from the web-server

Minification

  • Process of removing all unnecessary characters (whitespaces)
    • Replaces variables, function names with shorter naming
    • Reduces size of source files (css,js,html)
  • Mainly for all the website text resources
  • Implemented by web developers
function a(){return Math.floor(6*Math.random())+1}function b(){return new c.Promise(function(a,b){var c=Math.floor(6*Math.random())+1

Compression

  • Increases the perfomance of the website
  • Implemented on browsers/clients and servers
    • Loss-less compression - data is not altered, matches byte to byte with original
    • Loosy - original data is altered
  • Offically used: gzip, brotli

Resources

HTML5 dostalo aj niekoľko aktualizácii, aktuálne je verzii HTML 5.2.

Medzi HTML5 API patria napríklad: - Status batérie, podpora Clipboardu, súborové API. Offline aplikácie, Geolokácia, História navigácie

Semantické ako h1 alebo strong nám pomáhajú dať význam použitým elementom.

Treba zdôrazniť, že sémantické elementy

Pri br zmieniť, že sa nemá používa na pozicovanie elementov na to máme margin a iné css možnosti.

_blank je dobré využiť ak chceme užívateľovi neprerušiť uložený stav na stránke.

@todo Replace with example

Prehliadač následne vyberie obrázok podľa podporovaných formátov prípadne podľa veľkosti.

Typov inputov existuje mnoho: - password skrýva vložený text - color výber farby - hidden užívateľ nevidí - range výber - search, tel, time, url, week, date, checkbox

Existuje aj type button, submit je defaoult hodnota

Štruktúrovaný obsah zabezpečí, že čítačka sa sústredi na sémanticky dôležité časti stránky.

Nie je úplná podpora browserov pre a11y, pri použití HTML5 elementov

Aké modely domu máme a k čomu sú používané

Reprezentácia webovej stránky v pamäti, môže a nemusí byť reprezentovaná ako strom, záleží na internej implementácii prehliadača. V kóde je ukážka manipulácie s DOM