Websockets
- TCP: Stream-based communication
- UDP: Message-based communication
- JavaScript in the browser: Can't do TCP or UDP, but it can do HTTP
- Maybe that's a good thing... (security)
HTTP
- Traditionally, you make a request and get a response
- Problem: Not nearly as flexible as UDP and TCP
- Workarounds:
- Polling
- Long-polling (Comet)
- Other tricks
- We want to send messages in either direction, at any time
- We want them to arrive in order
- Solutions:
Newer technologies
- Websockets!
- WebRTC
- Streaming Media
- Messages (same API as Websockets)
- EventSource
- Server-generated events
- Just HTTP
- Push API
- Can run when page is closed (if user allows)
- Throttled/limited
- HTTP/2
Websockets
- Upgrade existing HTTP connections to a websocket
- Operate on the same port
- Use HTTP to start
- Try to be compatible with HTTP proxies
Why Websockets
- Full-duplex message-based communication
- Server can push whenever, client can push whenever
- Both sides can send data at the same time
- Connection stays open
- Don't need to poll
- Avoid big HTTP headers for each message
- Reuse existing technologies as much as possible
Websocket Handshake
Client makes an HTTP request:
GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Origin: http://example.com
Sec-WebSocket-Protocol: soap, wamp
Sec-WebSocket-Version: 13
Server responds:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Protocol: wamp
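As a rough sketch of what a server looks for in the request above, the following parses the raw request text and checks the upgrade-related headers. The helper names `parseHeaders` and `isWebSocketUpgrade` are made up for illustration; a real server would normally rely on its HTTP library for the parsing.

```javascript
// Hypothetical helper: split a raw HTTP request into its request line
// and a map of lowercase header names to values.
function parseHeaders(rawRequest) {
  const lines = rawRequest.split(/\r?\n/);
  const headers = {};
  for (const line of lines.slice(1)) { // skip the request line
    const idx = line.indexOf(":");
    if (idx === -1) continue;
    headers[line.slice(0, idx).trim().toLowerCase()] = line.slice(idx + 1).trim();
  }
  return { requestLine: lines[0], headers };
}

// Hypothetical helper: does this request ask to switch to WebSocket?
function isWebSocketUpgrade(rawRequest) {
  const { requestLine, headers } = parseHeaders(rawRequest);
  const connectionTokens = (headers["connection"] || "")
    .toLowerCase()
    .split(/\s*,\s*/);
  return (
    requestLine.startsWith("GET ") &&
    (headers["upgrade"] || "").toLowerCase() === "websocket" &&
    connectionTokens.includes("upgrade") &&
    headers["sec-websocket-version"] === "13" &&
    typeof headers["sec-websocket-key"] === "string"
  );
}

const exampleRequest = [
  "GET /chat HTTP/1.1",
  "Host: server.example.com",
  "Upgrade: websocket",
  "Connection: Upgrade",
  "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==",
  "Origin: http://example.com",
  "Sec-WebSocket-Protocol: soap, wamp",
  "Sec-WebSocket-Version: 13",
].join("\r\n");

console.log(isWebSocketUpgrade(exampleRequest)); // true
```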
No Ports
- Use the path in the GET request to specify which Websocket
- 1 webserver can service multiple websocket services
- wss://server.example.com/mysocket
- Connection: Upgrade
- Upgrade: websocket
- Specify the protocol we're upgrading to
- Optional Sec-WebSocket-Protocol: wamp
- Specify the sub-protocol: there are some pre-made subprotocols if you don't want to use raw websockets
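One way a server might choose a sub-protocol from the client's Sec-WebSocket-Protocol header is sketched below. `selectSubprotocol` is a hypothetical helper, and picking the client's first supported offer is just one reasonable policy.

```javascript
// Hypothetical helper: return the first sub-protocol offered by the client
// that the server also supports, or null if there is no overlap.
function selectSubprotocol(headerValue, supported) {
  if (!headerValue) return null;
  const offered = headerValue
    .split(",")
    .map((name) => name.trim())
    .filter(Boolean);
  return offered.find((name) => supported.includes(name)) || null;
}

console.log(selectSubprotocol("soap, wamp", ["wamp"])); // wamp
```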
Key and Accept
- Brown M&M test
- Client sends Sec-WebSocket-Key with a random string
- Server
- Takes the key, appends 258EAFA5-E914-47DA-95CA-C5AB0DC85B11
- Takes the SHA-1 sum of the result
- Base64-encodes the binary SHA-1 sum
- Sends that with the Sec-WebSocket-Accept header
- Client knows for sure the server's ready for websocket
- Request:
- Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
- Response:
- Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
- Magic string: 258EAFA5-E914-47DA-95CA-C5AB0DC85B11
- base64(sha1SumBinary(key+guid))
$ echo -en dGhlIHNhbXBsZSBub25jZQ==258EAFA5-E914-47DA-95CA-C5AB0DC85B11 \
| sha1sum
b37a4f2cc0624f1690f64606cf385945b2bec4ea -
$ echo s3pPLMBiTxaQ9kYGzzhZRbK+xOo= | base64 -d | hexdump -C
00000000 b3 7a 4f 2c c0 62 4f 16 90 f6 46 06 cf 38 59 45 |.zO,.bO...F..8YE|
00000010 b2 be c4 ea |....|
00000014
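The same computation can be sketched in JavaScript, assuming Node.js and its built-in crypto module. `computeAccept` and `WS_GUID` are names chosen for illustration.

```javascript
const crypto = require("node:crypto");

// The fixed "magic string" appended to every key.
const WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// base64(sha1(key + guid)): the value the server sends back in
// Sec-WebSocket-Accept, and the value the client checks for.
function computeAccept(key) {
  return crypto.createHash("sha1").update(key + WS_GUID).digest("base64");
}

console.log(computeAccept("dGhlIHNhbXBsZSBub25jZQ=="));
// s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```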
In WebSocket Protocol
- Client and server exchange "frames"
- Control frames (out of band)
- Close connection 0x8
- Ping 0x9 and Pong 0xA
- Data frames
- These are the messages, updates, events, RPCs, whatever you're exchanging with the server
Messages
- Sent in data frames
- Could be text or binary
- Size of each frame is known ahead of time (it is in the frame header)
- Can be fragmented into multiple data frames
Frames
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    |
|I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
|N|V|V|V|       |S|             |   (if payload len==126/127)   |
| |1|2|3|       |K|             |                               |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
|     Extended payload length continued, if payload len == 127  |
+ - - - - - - - - - - - - - - - +-------------------------------+
|                               |Masking-key, if MASK set to 1  |
+-------------------------------+-------------------------------+
| Masking-key (continued)       |          Payload Data         |
+-------------------------------- - - - - - - - - - - - - - - - +
:                     Payload Data continued ...                :
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
|                     Payload Data continued ...                |
+---------------------------------------------------------------+
- FIN marks the final fragment of a message
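To make the layout concrete, here is a minimal sketch of reading the header fields out of a frame, assuming Node.js `Buffer` input that already contains the whole header. `parseFrameHeader` is a hypothetical name, and the sketch does no validation of reserved bits or lengths.

```javascript
// Hypothetical helper: decode the fixed and extended parts of a frame header.
function parseFrameHeader(buf) {
  const fin = (buf[0] & 0x80) !== 0;     // top bit of byte 1
  const opcode = buf[0] & 0x0f;          // low 4 bits of byte 1
  const masked = (buf[1] & 0x80) !== 0;  // top bit of byte 2
  let payloadLength = buf[1] & 0x7f;     // low 7 bits of byte 2
  let offset = 2;
  if (payloadLength === 126) {
    payloadLength = buf.readUInt16BE(2); // 16-bit extended length
    offset = 4;
  } else if (payloadLength === 127) {
    // 64-bit extended length; Number() is exact only up to 2^53 - 1
    payloadLength = Number(buf.readBigUInt64BE(2));
    offset = 10;
  }
  let maskKey = null;
  if (masked) {
    maskKey = buf.subarray(offset, offset + 4);
    offset += 4;
  }
  return { fin, opcode, masked, payloadLength, maskKey, headerLength: offset };
}

// An unmasked, unfragmented text frame carrying "Hello"
console.log(parseFrameHeader(Buffer.from("810548656c6c6f", "hex")));
```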
Compact binary header
- Byte 1: FIN bit, 3 reserved bits, 4-bit opcode
- Byte 2: Mask (enables XOR "encryption" mask) and payload len
- If payload len = 126, 16-bit payload len follows
- If payload len = 127, 64-bit payload len follows
- If mask is set, the masking key follows
- Finally the data
- Smallest frame: 1-byte message, no mask = 3 bytes (2-byte header + 1 byte of data)!
- Largest possible header: 14 bytes
- Each message: just send fragments until FIN
- Don't need to worry about ordering: TCP is handling that
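The header rules above can be sketched as an encoder, assuming Node.js `Buffer`. `encodeFrameHeader` and its parameter names are made up for illustration.

```javascript
// Hypothetical helper: build just the header bytes for one frame.
// maskKey, if given, is an array or Buffer of 4 bytes.
function encodeFrameHeader({ fin, opcode, payloadLength, maskKey = null }) {
  const first = (fin ? 0x80 : 0x00) | (opcode & 0x0f);
  const maskBit = maskKey ? 0x80 : 0x00;
  let header;
  if (payloadLength < 126) {
    // Length fits in the 7-bit field
    header = Buffer.from([first, maskBit | payloadLength]);
  } else if (payloadLength <= 0xffff) {
    // 126 signals that a 16-bit length follows
    header = Buffer.alloc(4);
    header[0] = first;
    header[1] = maskBit | 126;
    header.writeUInt16BE(payloadLength, 2);
  } else {
    // 127 signals that a 64-bit length follows
    header = Buffer.alloc(10);
    header[0] = first;
    header[1] = maskBit | 127;
    header.writeBigUInt64BE(BigInt(payloadLength), 2);
  }
  return maskKey ? Buffer.concat([header, Buffer.from(maskKey)]) : header;
}

// Smallest header: 2 bytes
console.log(encodeFrameHeader({ fin: true, opcode: 0x1, payloadLength: 5 }).length);
// Largest header: 2 + 8 + 4 = 14 bytes
console.log(
  encodeFrameHeader({ fin: true, opcode: 0x2, payloadLength: 70000, maskKey: [1, 2, 3, 4] }).length
);
```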
Opcodes
- 0x0 continuation (later fragments of a message)
- 0x1 UTF-8 text (don't break characters)
- 0x2 binary
- 0x8 close
- 0x9 ping
- 0xA pong
Example:
- 0x01 0x03 0x48 0x65 0x6c "Hel"... (FIN=0, opcode 0x1 text, length 3)
- 0x80 0x02 0x6c 0x6f ..."lo" (FIN=1, opcode 0x0 continuation, length 2)
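The example bytes can be reproduced with a small sketch that splits a text message into unmasked fragments and puts them back together. It assumes Node.js `Buffer`, ASCII-only text, and fragments of at most 125 bytes; `fragmentText` and `reassemble` are hypothetical names.

```javascript
// Hypothetical helper: split text into unmasked frames of chunkSize bytes.
// The first frame uses opcode 0x1 (text), later ones 0x0 (continuation),
// and only the last one has FIN set.
function fragmentText(text, chunkSize) {
  const payload = Buffer.from(text, "utf8");
  const frames = [];
  for (let offset = 0; offset < payload.length; offset += chunkSize) {
    const chunk = payload.subarray(offset, offset + chunkSize);
    if (chunk.length > 125) throw new Error("sketch only handles 7-bit lengths");
    const isFirst = offset === 0;
    const isLast = offset + chunkSize >= payload.length;
    const firstByte = (isLast ? 0x80 : 0x00) | (isFirst ? 0x1 : 0x0);
    frames.push(Buffer.concat([Buffer.from([firstByte, chunk.length]), chunk]));
  }
  return frames;
}

// Hypothetical helper: concatenate payloads until a frame with FIN set.
function reassemble(frames) {
  const parts = [];
  for (const frame of frames) {
    const length = frame[1] & 0x7f;
    parts.push(frame.subarray(2, 2 + length));
    if (frame[0] & 0x80) break; // FIN: message complete
  }
  return Buffer.concat(parts).toString("utf8");
}

const helloFrames = fragmentText("Hello", 3);
console.log(helloFrames.map((f) => f.toString("hex"))); // [ '010348656c', '80026c6f' ]
console.log(reassemble(helloFrames)); // Hello
```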
Why "mask"?
- Not for privacy (encryption)
- Prevent accidentally being parsed as HTTP!
- Websockets are supposed to work with existing infrastructure
- Maintainers worried about cache poisoning: fake-looking GET requests sent over websockets could fool bad proxy servers, etc.
- Masking garbles each frame with a mask so that you can't send a GET request in the plain
- Allows browsers to protect against malicious pages doing bad things they shouldn't
- Of course we can't do anything about custom clients, which can send whatever they want over TCP
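The masking step itself is a plain XOR against a repeating 4-byte key, so applying it twice restores the original. A minimal sketch, assuming Node.js `Buffer`; `applyMask` is a name chosen for illustration.

```javascript
// XOR every payload byte with the matching byte of the 4-byte masking key.
// The same function both masks and unmasks.
function applyMask(payload, maskKey) {
  const out = Buffer.alloc(payload.length);
  for (let i = 0; i < payload.length; i++) {
    out[i] = payload[i] ^ maskKey[i % 4];
  }
  return out;
}

const key = Buffer.from([0x37, 0xfa, 0x21, 0x3d]);
const masked = applyMask(Buffer.from("Hello"), key);
console.log(masked.toString("hex"));              // 7f9f4d5158
console.log(applyMask(masked, key).toString());   // Hello

// A payload that looks like HTTP no longer does once masked
const fakeRequest = applyMask(Buffer.from("GET / HTTP/1.1"), key);
console.log(fakeRequest.toString("latin1").startsWith("GET")); // false
```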
WebSocket URIs
- You can use ws://yourserver.com:9090/websockethandler/
- wss: is websocket secure
- Inherits TLS from the HTTPS connection used initially
- Same format as HTTP URI
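Because the format matches HTTP URIs, the standard `URL` class can take a WebSocket URI apart. The sketch below assumes a runtime with a global `URL` (browsers, Node.js); `describeWebSocketUrl` is a hypothetical helper, and the default ports (80 for ws, 443 for wss) follow from reusing the HTTP/HTTPS ports.

```javascript
// Hypothetical helper: break a ws:// or wss:// URI into its parts.
function describeWebSocketUrl(uri) {
  const url = new URL(uri);
  if (url.protocol !== "ws:" && url.protocol !== "wss:") {
    throw new Error("not a WebSocket URI: " + uri);
  }
  const secure = url.protocol === "wss:";
  return {
    secure,
    host: url.hostname,
    // url.port is "" when the URI uses the scheme's default port
    port: url.port ? Number(url.port) : secure ? 443 : 80,
    path: url.pathname,
  };
}

console.log(describeWebSocketUrl("ws://yourserver.com:9090/websockethandler/"));
console.log(describeWebSocketUrl("wss://server.example.com/mysocket"));
```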
Performance
- Better two-way communication
- Missing out on client side caching
- Reinvent the wheel (AKA TCP/UDP)
- Beat the firewall
- Doesn't fully replace XHR/fetch AJAX
Errors
- Bad UTF-8 encoding → close connection
- No real prescription other than to close the connection
- Closing is done by control frame, TLS, and TCP close
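A receiver might check a text payload like the sketch below, which uses the standard `TextDecoder` in fatal mode. `decodeTextPayload` is a hypothetical helper; the close code 1007 is the one RFC 6455 assigns to invalid payload data, and how the close frame is actually sent is left out.

```javascript
// Hypothetical helper: decode a text message payload, or report that the
// connection should be closed because the bytes are not valid UTF-8.
function decodeTextPayload(payload) {
  try {
    // fatal: true makes decode() throw instead of inserting U+FFFD
    const text = new TextDecoder("utf-8", { fatal: true }).decode(payload);
    return { ok: true, text };
  } catch (err) {
    return { ok: false, closeCode: 1007 };
  }
}

console.log(decodeTextPayload(new Uint8Array([0x48, 0x69]))); // valid: "Hi"
console.log(decodeTextPayload(new Uint8Array([0xc3, 0x28]))); // invalid sequence
```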
In the Browser
- JS code in the browser won't have access to fragments, masking, etc.
- Simple browser API:
- Open
- Send and receive messages
- Close
- Browser sanitizes everything and stays in control, to prevent malicious web pages from exploiting your browser to do things like poison proxies
WebSocket in JS
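// Open a connection; optionally offer sub-protocols as a second argument:
// var ws = new WebSocket("ws://www.example.com/socketserver", ["proto1", "proto2"]);
var ws = new WebSocket("ws://www.example.com/socketserver");
// send() only works once the connection is open
ws.onopen = function () {
  ws.send("A string"); // send text
  var buffer = new ArrayBuffer(16);
  var int32View = new Int32Array(buffer);
  ws.send(int32View); // send binary
  ws.close();
};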
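Receiving works through event handlers on the same object. The sketch below wires up the standard `onopen`, `onmessage`, `onclose`, and `onerror` handlers; `wireUp` and its `log` callback are hypothetical, and the function takes the socket as a parameter so it can be tried without a live server.

```javascript
// Hypothetical helper: attach handlers to a WebSocket (or anything with the
// same shape) and report what happens through the log callback.
function wireUp(ws, log) {
  ws.binaryType = "arraybuffer"; // deliver binary messages as ArrayBuffer
  ws.onopen = function () {
    log("open");
    ws.send("A string");
  };
  ws.onmessage = function (event) {
    if (typeof event.data === "string") {
      log("text: " + event.data);
    } else {
      log("binary: " + event.data.byteLength + " bytes");
    }
  };
  ws.onclose = function (event) {
    log("closed: " + event.code);
  };
  ws.onerror = function () {
    log("error");
  };
}

// In a browser:
// wireUp(new WebSocket("ws://www.example.com/socketserver"), console.log);
```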
Resources
- Mozilla WebSocket dev guide
- ByteArrays/Typed Arrays in JS
- Async Interactions Server Side
- Javascript Example using WebAudio
License
Copyright 2014-2023 ⓒ Abram Hindle
Copyright 2019-2023 ⓒ Hazel Victoria Campbell and contributors
The textual components and original images of this slide deck are
licensed under a Creative Commons Attribution-ShareAlike 4.0 International
License.
Other images are used under fair use and are copyright their respective copyright holders.
License
Copyright (C) 2019-2023 Hazel Victoria Campbell
Copyright (C) 2014-2023 Abram Hindle and contributors
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.