[EloquentJS] Ch13. JavaScript and the Browser

dododoo 2020. 4. 22. 14:40

Networks and the Internet

  • A network protocol describes a style of communication over a network. For example, the Hypertext Transfer Protocol (HTTP) is a protocol for retrieving named resources (chunks of information, such as web pages or pictures).
  • Most protocols are built on top of other protocols. HTTP treats the network as a streamlike device into which you can put bits and have them arrive at the correct destination in the correct order. As we saw in Chapter 11, ensuring those things is already a rather difficult problem.
  • The Transmission Control Protocol (TCP) is a protocol that addresses this problem.
  • A TCP connection works as follows: one computer must be waiting, or listening, for other computers to start talking to it. To be able to listen for different kinds of communication at the same time on a single machine, each listener has a number (called a port) associated with it. Most protocols specify which port should be used by default.
  • Another computer can then establish a connection by connecting to the target machine using the correct port number. If the target machine can be reached and is listening on that port, the connection is successfully created. The listening computer is called the server, and the connecting computer is called the client.
  • Such a connection acts as a two-way pipe through which bits can flow—the machines on both ends can put data into it. ... You could say that TCP provides an abstraction of the network.
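
The listen/connect steps above can be sketched with Node.js's built-in net module. This is only an illustration, not code from the book: the runEchoDemo helper, the loopback address 127.0.0.1, and the use of port 0 (which asks the operating system for any free port) are all assumptions made for this demo.

```javascript
// Minimal TCP echo sketch using Node.js's built-in "net" module.
const net = require("net");

// Hypothetical helper: starts a server, connects a client to it,
// sends `message`, and resolves with whatever the server echoed back.
function runEchoDemo(message) {
  return new Promise((resolve, reject) => {
    // The server side: wait (listen) for clients and echo their data.
    const server = net.createServer((socket) => {
      socket.on("data", (chunk) => socket.write(chunk));
    });
    server.on("error", reject);

    // Port 0 lets the OS pick a free port (an assumption for this demo).
    server.listen(0, "127.0.0.1", () => {
      const { port } = server.address();

      // The client side: connect to the listening port, then write.
      const client = net.createConnection({ port, host: "127.0.0.1" }, () => {
        client.write(message);
      });

      let received = "";
      client.setEncoding("utf8");
      client.on("data", (text) => {
        received += text;
        // Close our end once the whole message has come back.
        if (received.length >= message.length) client.end();
      });
      client.on("close", () => {
        server.close(() => resolve(received));
      });
      client.on("error", reject);
    });
  });
}

runEchoDemo("hello").then((reply) => console.log(reply)); // prints "hello"
```

Both ends read from and write to the same connection, which is the two-way pipe described above.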

The Web

  • The World Wide Web (not to be confused with the Internet as a whole) is a set of protocols and formats that allow us to visit web pages in a browser. The “Web” part in the name refers to the fact that such pages can easily link to each other, thus connecting into a huge mesh that users can move through.
  • To become part of the Web, all you need to do is connect a machine to the Internet and have it listen on port 80 with the HTTP protocol so that other computers can ask it for documents.
  • Each document on the Web is named by a Uniform Resource Locator (URL), which looks something like this:
  •  http://eloquentjavascript.net/13_browser.html
    |      |                      |               |
    protocol       server               path
  • The first part tells us that this URL uses the HTTP protocol (as opposed to, for example, encrypted HTTP, which would be https://). Then comes the part that identifies which server we are requesting the document from. Last is a path string that identifies the specific document (or resource) we are interested in.
  • IP address, domain name ...
  • If you type this URL into your browser’s address bar, the browser will try to retrieve and display the document at that URL. First, your browser has to find out what address eloquentjavascript.net refers to. Then, using the HTTP protocol, it will make a connection to the server at that address and ask for the resource /13_browser.html. If all goes well, the server sends back a document, which your browser then displays on your screen.
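
The three parts of the URL can also be pulled apart in code. The following is a small sketch using the standard URL class, which is available in browsers and in Node.js. The describeUrl helper and the defaultPorts table are names invented for this example.

```javascript
// Default ports for the two protocols mentioned above.
const defaultPorts = { "http:": 80, "https:": 443 };

// Hypothetical helper: split a URL into protocol, server, port, and path.
function describeUrl(urlString) {
  const url = new URL(urlString);
  return {
    protocol: url.protocol.replace(/:$/, ""), // "http:" -> "http"
    server: url.hostname,
    // url.port is an empty string when the protocol's default port is used.
    port: url.port === "" ? defaultPorts[url.protocol] : Number(url.port),
    path: url.pathname,
  };
}

console.log(describeUrl("http://eloquentjavascript.net/13_browser.html"));
// protocol "http", server "eloquentjavascript.net", port 80, path "/13_browser.html"
```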

HTML

  • HTML, which stands for Hypertext Markup Language, is the document format used for web pages. An HTML document contains text, as well as tags that give structure to the text, describing things such as links, paragraphs, and headings.

  • <!doctype html>
    <html>
        <head>
            <meta charset="utf-8">
            <title>My home page</title>
        </head>
        <body>
            <h1>My home page</h1>
            <p>Hello, I am Marijn and this is my home page.</p>
            <p>I also wrote a book! Read it
                <a href="http://eloquentjavascript.net">here</a>.</p>
        </body>
    </html>
  • The tags, wrapped in angle brackets, provide information about the structure of the document. The other text is just plain text.

  • The document starts with <!doctype html>, which tells the browser to interpret the page as modern HTML, as opposed to various dialects that were in use in the past.

  • HTML documents have a head and a body. The head contains information about the document, and the body contains the document itself. In this case, the head declares that the title of this document is “My home page” and that it uses the UTF-8 encoding, which is a way to encode Unicode text as binary data. The document’s body contains a heading (<h1>, meaning “heading 1”—<h2> to <h6> produce subheadings) and two paragraphs (<p>).

  • Tags come in several forms. An element, such as the body, a paragraph, or a link, is started by an opening tag like <p> and ended by a closing tag like </p>. Some opening tags, such as the one for the link (<a>), contain extra information in the form of name="value" pairs. These are called attributes. In this case, the destination of the link is indicated with href="http://eloquentjavascript.net", where href stands for “hypertext reference”.

  • Some kinds of tags do not enclose anything and thus do not need to be closed. The metadata tag <meta charset="utf-8"> is an example of this.

  • To be able to include angle brackets in the text of a document, even though they have a special meaning in HTML, yet another form of special notation has to be introduced. A plain opening angle bracket is written as &lt; (“less than”), and a closing bracket is written as &gt; (“greater than”). In HTML, an ampersand (&) character followed by a name or character code and a semicolon (;) is called an entity and will be replaced by the character it encodes.

  • This is analogous to the way backslashes are used in JavaScript strings. Since this mechanism gives ampersand characters a special meaning, too, they need to be escaped as &amp;. Inside attribute values, which are wrapped in double quotes, &quot; can be used to insert an actual quote character.

  • HTML is parsed in a remarkably error-tolerant way. When tags that should be there are missing, the browser reconstructs them.

  • <!doctype html>
    
    <meta charset=utf-8>
    <title>My home page</title>
    
    <h1>My home page</h1>
    <p>Hello, I am Marijn and this is my home page.
    <p>I also wrote a book! Read it
        <a href=http://eloquentjavascript.net>here</a>.
  • The <html>, <head>, and <body> tags are gone completely. The browser knows that <meta> and <title> belong in the head and that <h1> means the body has started. Furthermore, I am no longer explicitly closing the paragraphs since opening a new paragraph or ending the document will close them implicitly. The quotes around the attribute values are also gone.

  • This book will usually omit the <html>, <head>, and <body> tags from examples to keep them short and free of clutter. But I will close tags and include quotes around attributes.

  • I will also usually omit the doctype and charset declaration. This is not to be taken as an encouragement to drop these from HTML documents. Browsers will often do ridiculous things when you forget them. You should consider the doctype and the charset metadata to be implicitly present in examples, even when they are not actually shown in the text.
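
The entity notation described earlier in this section (&lt;, &gt;, &amp;, &quot;) is something a program often has to apply itself before putting arbitrary text into a page. Here is a minimal sketch; escapeHtml is a hypothetical helper name, and it covers only the four entities mentioned above.

```javascript
// Map each special character to the entity that encodes it.
const entities = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;" };

// Hypothetical helper: replace every special character in one pass,
// so that an already-inserted "&" is never escaped a second time.
function escapeHtml(text) {
  return String(text).replace(/[&<>"]/g, (ch) => entities[ch]);
}

console.log(escapeHtml('<a href="x">Tom & Jerry</a>'));
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```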

HTML and JavaScript

  • In the context of this book, the most important HTML tag is <script>. This tag allows us to include a piece of JavaScript in a document.
  • <h1>Testing alert</h1>
    <script>alert("hello!");</script>
  • Such a script will run as soon as its <script> tag is encountered while the browser reads the HTML.
  • Including large programs directly in HTML documents is often impractical. The <script> tag can be given an src attribute to fetch a script file (a text file containing a JavaScript program) from a URL.
  • <h1>Testing alert</h1>
    <script src="code/hello.js"></script>
  • The code/hello.js file included here contains the same program—alert("hello!"). When an HTML page references other URLs as part of itself—for example, an image file or a script—web browsers will retrieve them immediately and include them in the page.
  • A script tag must always be closed with </script>, even if it refers to a script file and doesn’t contain any code. If you forget this, the rest of the page will be interpreted as part of the script.
  • You can load ES modules in the browser by giving your script tag a type="module" attribute. Such modules can depend on other modules by using URLs relative to themselves as module names in import declarations.
  • Some attributes can also contain a JavaScript program. The <button> tag shown next (which shows up as a button) has an onclick attribute. The attribute’s value will be run whenever the button is clicked.
  • <button onclick="alert('Boom!');">DO NOT PRESS</button>
  • Note that I had to use single quotes for the string in the onclick attribute because double quotes are already used to quote the whole attribute. I could also have used &quot;.
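
To make "URLs relative to themselves" concrete, the sketch below resolves a relative module name against the URL of the module that imports it, using the standard URL class. It illustrates only the URL-resolution step, not the browser's full module loading. The example.com address and the resolveModuleSpecifier helper are made up for this example.

```javascript
// Hypothetical URL of the module that contains the import declaration.
const importerUrl = "http://example.com/code/main.js";

// Hypothetical helper: resolve a relative module name against its importer.
function resolveModuleSpecifier(specifier, importer) {
  return new URL(specifier, importer).href;
}

console.log(resolveModuleSpecifier("./hello.js", importerUrl));
// http://example.com/code/hello.js
console.log(resolveModuleSpecifier("../lib/util.js", importerUrl));
// http://example.com/lib/util.js
```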

In the sandbox

  • Running programs downloaded from the Internet is potentially dangerous. Yet the attraction of the Web is that you can browse it without necessarily trusting all the pages you visit. This is why browsers severely limit the things a JavaScript program may do: it can’t look at the files on your computer or modify anything not related to the web page it was embedded in.
  • Isolating a programming environment in this way is called sandboxing, the idea being that the program is harmlessly playing in a sandbox. But you should imagine this particular kind of sandbox as having a cage of thick steel bars over it so that the programs playing in it can’t actually get out.
  • The hard part of sandboxing is allowing the programs enough room to be useful yet at the same time restricting them from doing anything dangerous. Lots of useful functionality, such as communicating with other servers or reading the content of the copy-paste clipboard, can also be used to do problematic, privacy-invading things.
  • Every now and then, someone comes up with a new way to circumvent the limitations of a browser and do something harmful, ranging from leaking minor private information to taking over the whole machine that the browser runs on. The browser developers respond by fixing the hole, and all is well again—until the next problem is discovered, and hopefully publicized, rather than secretly exploited by some government agency or mafia.

Compatibility and the browser wars

  • ... The latest versions of the major browsers behave quite uniformly and have relatively few bugs.