An Introduction to Network Programming - The art of transferring data

I assume that -more or less- you all know what a network is. But, just to make sure:
A network is a link between two or more ends, used to transfer something from one end to another.
There are railroad networks, telecommunications networks, broadcasting networks, and there also are computer networks.
Each of them carries something (people, supplies, signals or data) and each has a starting point and a destination. In most of the above examples, the starting point (or sender) can also be the destination (receiver), and vice versa (bidirectional networks).
Let's leave the trains out of this for now, and concentrate on computer networks.
Computer networks are wired or wireless links established between computers. They give one end the ability to access data (stored or generated) on the other end, and (not always) vice versa.
Skipping several steps, and decades of evolution, let's assume we have a computer connected to the Internet. According to the paragraph above, the Internet, as a computer network, is used to transfer data from one computer to another. But how is this accomplished?

Let's see what happens every time you want to open a web page:

1. You turn on your computer (if it's one of the lucky machines that get shut down occasionally, that is).
2. You dial your Internet Service Provider.
3. Your ISP checks your user name and password.
4. You open a browser and enter a valid url. A page opens up in your browser.

And now, let's see what sockets, protocols, IPs and ports have to do with the above.

IP: a machine's ID on a net.

Think of the IP as someone's telephone number, or as the address of a building, or as the name of a global variable of a computer program. Each of the above has to be unique, and is used to address its owner (make a call, deliver a parcel, retrieve the variable's content) when needed.
Every machine on an IP based network is known by its IP. In order for your machine to become a member of such a network, it must have a unique IP (for IPv4, the familiar number in xxx.xxx.xxx.xxx format).
In step 3, an IP is automatically assigned to your machine by your ISP: every ISP owns a range of unique IP numbers. When you call your ISP, one of its IPs is assigned to your machine, and your communications device (modem/network card) informs your machine which IP that is. That particular IP is reserved, and won't be assigned to another machine as long as you stay connected.
In step 4 you typed a human readable name string (url). As said above though, a machine is known by its IP. So, there has to be a method to convert that name to an IP, for your machine to understand what you typed. This is what Domain Name Servers do. They try to translate a name to an IP. This procedure is called 'DNS lookup'. Whenever you type a url in your browser, a message containing the 'domain' (the www.domain.suffix part) is sent to such a server. If the name is unknown to the DNS, a 'not found' is returned, usually resulting in a 'Host not found' reply on your monitor.
If the name is known, its corresponding IP is returned to your machine. But, in order to send a message, knowing the IP is not enough. Another thing called 'port' is required.
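A DNS lookup can be tried out from a script. The following is only a minimal sketch in Python (not the language this document's examples are written for). The function name lookup is made up for illustration, and the standard library resolver it calls may answer from a local hosts file instead of contacting a name server.

```python
import socket

def lookup(name):
    """Return the IPv4 addresses a host name resolves to, or [] if unknown."""
    try:
        infos = socket.getaddrinfo(name, None, family=socket.AF_INET,
                                   type=socket.SOCK_STREAM)
    except socket.gaierror:
        # The resolver could not translate the name: the 'Host not found' case.
        return []
    # Each entry is (family, type, proto, canonname, (ip, port)).
    return sorted({info[4][0] for info in infos})

print(lookup("localhost"))
```

Passing a real domain name instead of "localhost" triggers an actual lookup, provided the machine is on-line.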

What are 'ports' and why do we need them?

According to the above paragraph, since we have an IP and we know the remote machine's IP, we should be able to start communicating, right? Well, sort of.
Imagine a server machine, running a web server and an ftp server application. Many remote machines will probably want to communicate with such a server. Some will ask for web pages, while others may want to transfer files.
Let's examine step 4 of the previous example again: your machine sends a message to the server, in order to retrieve a web page. Let's suppose that at the same time, another machine is sending an ftp request, in order to initiate a file transfer.
The same machine (server) accepts two requests at the same time. So, the server should be able to process both messages, tell whether it received a web page request or an ftp command, and act accordingly. And processing takes time. Wouldn't it be better if the data it received had a stamp on them, declaring what type of data they are? Well, instead of that, they have a label, called a 'port'. A port is a 16-bit number (0-65535) that is used for internal forwarding of the received data to the appropriate application.
At this point, we can start talking about 'packets'. Whenever you send a message, your computer wraps the data inside a container -think of it as a parcel/packet- and puts a tag on it, containing the ip and port of the machine they are addressed to. It then sends the packet to the ISP. From there on, routers take control, and, by reading the address they try to send the packet to the router they think is closer to the final recipient.
[ Since we speak of packets, it may be handy to know that the label of a packet also contains a number (a hop counter, called 'time to live' or TTL in IPv4) that is decreased by one every time the message is sent from one router to another. If the number reaches zero, the router holding the packet at that point destroys it. This is a method to avoid having packets traveling eternally from router to router without ever reaching a destination.]
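The hop counter can be illustrated with a toy simulation. This is a sketch only: no real packets are involved, and the function forward and the router names are invented for the example.

```python
def forward(ttl, route):
    """Pass a packet along `route`, a list of router names.

    Returns (delivered, last_router). delivered is True when the packet
    got past every router with hops to spare.
    """
    for router in route:
        ttl -= 1                      # every router decreases the counter
        if ttl <= 0:
            return False, router      # this router destroys the packet
    return True, (route[-1] if route else None)

print(forward(64, ["a", "b", "c"]))
print(forward(2, ["a", "b", "c"]))
```

With a generous starting value the packet passes all three routers. With a starting value of 2 it is destroyed at router "b".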

In conclusion:

A port (number) is for a machine what an IP is for a network.
Packets received by a machine (id=IP, handled by the router) are being unwrapped, and the original data are being forwarded to the application that 'listens' on a particular port (id=port, handled by the local machine).
As two machines of the same network can't share the same IP, two applications of the same machine shouldn't be listening on the same port- though they can be forced to, and sometimes we may even want them to.
Most known network services have a default port- a port which is usually (but not necessarily) used as the listening port for those services. E.g., for http servers, the default port is 80. This doesn't mean that you can't launch a server on port 5000, or on any other port within the range of 0-65535, assuming of course that the port is not already in use (by another application, or the OS itself). The only thing to keep in mind when using alternative ports is that the clients have to know about it. E.g. if you launch your web server at port 5000, the url http://www.domain.com won't work, since the browser will assume that your remote server listens at port 80. So, you have to also declare the port: http://www.domain.com:5000
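How a client decides which port to use can be sketched in a few lines of Python. The helper host_and_port and the small table of default ports are assumptions made for this example. A real browser knows many more schemes.

```python
from urllib.parse import urlsplit

# Default listening ports of a few well known services.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}

def host_and_port(url):
    """Return (host, port), falling back to the service's default port."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port

print(host_and_port("http://www.domain.com"))
print(host_and_port("http://www.domain.com:5000"))
```

The first url carries no port, so the default for http is assumed. The second one names port 5000 explicitly.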

The 'Socket' concept

So far we spoke about IPs and ports. But how can an application actually use them, in order to communicate with a remote application?
That's what 'sockets' are there for. Sockets are the means of communication between your application and the lower level mechanisms that are responsible for all network actions. Most operating systems support sockets, and many mid/hi-level commands (like connect, send, receive) share the same syntax and use common low-level protocols (os> nic>signal>nic>os), while an OS may also support extra socket commands.
With the use of sockets and common protocols, communication between applications running even on different platforms is achieved.
The main property (socket type) that describes a socket is the base protocol it will use (mid-level as far as your computer is concerned, low-level as far as e.g. a C++ application is concerned). The most commonly used sockets are tcp (stream) and udp (datagram) sockets.
In step 4 of our example, as soon as the IP of the remote machine is retrieved, your browser creates a tcp socket and sends the 'connect' command to it, with the ip and destination port of the remote machine passed as arguments to the command. The OS's socket handling mechanism (Winsock, on Windows) takes control. If the connect command is successful, both sockets (local and remote) are notified, and the notification is then passed to your application's callback handler, assuming one has been defined. The connection has been established. You can now start sending (issue the 'send' command to the socket, with your data as argument) and receiving data.
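The create / connect / send sequence looks roughly like this in Python. It is a sketch under a few assumptions: the calls are blocking (no callback handler is involved), the 'remote' machine is a tiny stand-in server running on the same computer, and the names serve_once and request are invented for the example.

```python
import socket
import threading

def serve_once(listener):
    """Accept a single client and send back whatever it sent, upper-cased."""
    conn, _addr = listener.accept()          # new socket for this client
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())

def request(host, port, payload):
    """Create a TCP socket, connect, send payload, return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.connect((host, port))           # the 'connect' command
        sock.sendall(payload)                # the 'send' command
        return sock.recv(1024)

# Stand-in server on the loopback interface; port 0 lets the OS pick a free one.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
port = listener.getsockname()[1]
worker = threading.Thread(target=serve_once, args=(listener,))
worker.start()

reply = request("127.0.0.1", port, b"hello")
worker.join()
listener.close()
print(reply)
```

Replacing "127.0.0.1" and the port with the address of a real server is all it takes to talk to another machine.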

Notes:

In order to transfer data, two sockets (one on each machine) are required.
Not all sockets have to be connected to transfer data. In general, all stream sockets (e.g. tcp sockets) need to establish a connection prior to sending/receiving, while datagram sockets (e.g. udp sockets) don't.
Two sockets (usually) don't share the same port.
A socket can only be connected to one machine. At the server side of our web server example, a new socket is created for each client (browser) that connects to the server.
A TCP socket can send messages to just one socket (the one it's connected to).
A TCP socket can be in one of the following four states:
Unconnected: Socket Created, not connected.
Listening: Host socket, waiting for incoming connections. Doesn't transfer data.
Client: Can be considered as a host socket's offspring. A server creates a new client socket for each client that connects to the server.
Remote: Serverside, it is the socket created by the remote machine to connect to the server (to a client socket). Clientside, it's a socket created to connect to a remote machine.
A socket (TCP/UDP) can be bound to a local IP and port, but only right after creation (before it is converted to a listening or remote socket).
A bound socket can't be unbound- destroy it and create a new one instead.
A TCP socket that has been assigned a mode (listening, client or remote) can't change the mode it's operating in. It has to be destroyed.
A TCP socket that has established a connection (client/remote) can't disconnect and connect to another socket. It has to be destroyed.
A TCP listening socket can't stop listening- it has to be destroyed.
A UDP socket doesn't have to connect to exchange data. A destination can be defined when sending a message. All data addressed to the socket (ip&port) from any machine will be received.
A UDP socket can connect, but connection is virtual. It just means that the transactions (send/receive) are limited to the remote address specified- incoming data from other addresses will be discarded.
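Some of the TCP notes above can be observed directly. The sketch below, in Python and on the loopback interface only, creates one listening socket and connects two clients to it. The variable names are made up for the example. It also shows one of the 'usually' cases: the sockets the server creates for its clients all report the listening socket's local port, and are told apart by the address of the peer they are connected to.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # unconnected
listener.bind(("127.0.0.1", 0))     # bind right after creation
listener.listen()                   # from now on, a listening socket
address = listener.getsockname()

# Two clients connect; the server gets one new socket per client.
clients = [socket.create_connection(address) for _ in range(2)]
accepted = [listener.accept()[0] for _ in range(2)]

server_side_ports = {s.getsockname()[1] for s in accepted}
peers = {s.getpeername() for s in accepted}
client_addresses = {c.getsockname() for c in clients}

for s in clients + accepted + [listener]:
    s.close()

print(server_side_ports, peers)
```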

Protocols- The network's languages

Every communication requires a method. Usually, this method is called a language. On computer networks, it is called a protocol.
Most languages can be considered high-level, since they are based on lower-level methods / concepts:
Spoken language: meaning>words>mouth>sounds>ear>words>meaning
Director's Lingo: lingo command>C++ compiled code>cpu>C++ compiled code>result
Http protocol: http string>tcp packet>signal>tcp packet>http string
Ok, some steps have been skipped in the above, but in general you get the idea.
So, you could say that C++ is to lingo what the tcp protocol is to the http protocol… sort of.
As with all methods of communications, the higher the level, the easier to use, but also slower, since extra processing is required. The mind is (or at least should be) faster than the mouth.
From now on, we'll be calling protocols such as tcp and udp 'base' protocols, since this is as low as we'll be getting- no real need to go closer to the machine. And we'll be calling protocols such as http and ftp high-level protocols, since… well, that's what they are. They sit on top of other protocols, so they should look taller as well…

In general:

a base protocol is used to define the low-level encoding method that your OS will use in order to pack and send / receive and unpack the data. It may, or may not, require a connection, it may, or may not, support error checking and may be streaming or message-oriented.
a high-level protocol deals with the encoding of the actual data. When a browser sends a request to a web server, it actually sends an http encoded message- in other words, a string any application that 'speaks http' can understand. The message (string) that is generated by your browser is made available as is to the remote socket (ip:port, as described earlier). The application that has control of that socket (the web server in our example) retrieves the string from the socket, and then has to parse the data, to get what your browser wanted to do. The reverse procedure (data>http encoding>socket send>…) is then followed, in order for the reply to be returned to your browser.
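To make the 'http encoded message' less abstract, here is a sketch of the string a browser might build and of the parsing a server has to do on the other end. It covers only a bare GET request. The function names build_request and parse_request are invented, and real http messages carry many more headers.

```python
def build_request(host, path="/"):
    """Return the bytes a browser might send for a simple page request."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")

def parse_request(raw):
    """Server side: recover the method, path and headers from the bytes."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("ascii")
    request_line, *header_lines = head.split("\r\n")
    method, path, _version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, path, headers

raw = build_request("www.domain.com", "/index.html")
print(raw)
print(parse_request(raw))
```

The bytes returned by build_request are what would be handed to the 'send' command of a TCP socket.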

TCP- Transmission Control Protocol

TCP is by far the most commonly used protocol in today's networks. It is not the fastest, but it's a 'reliable' protocol. A protocol is usually called 'reliable' when it supports mechanisms that guarantee both the order and the error-free delivery of the message, as well as the delivery itself.
TCP is a 'streaming' protocol. A streaming protocol is one that has no fixed message size limit. A lengthy message is automatically split into parts by the lower level mechanisms, in accordance with the TCP protocol. Those parts are wrapped and sent to the remote machine. The remote machine receives those packets and then has to append the data contained in each packet to a local buffer, in order to reconstruct the original message.
Furthermore, in order to transfer data with the tcp protocol, a connection (socket to socket) must have been established between two machines.
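The 'append to a local buffer' step is worth a sketch, because a stream carries no message boundaries of its own. The example below assumes a convention that is not part of TCP: every message is prefixed with its length as a 4-byte number. The function names are invented, and socket.socketpair() stands in for an established stream connection between two machines.

```python
import socket
import struct
import threading

def send_message(sock, payload):
    # 4-byte big-endian length prefix, then the payload itself.
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exactly(sock, count):
    """Keep appending received parts to a buffer until `count` bytes arrived."""
    buffer = bytearray()
    while len(buffer) < count:
        chunk = sock.recv(min(4096, count - len(buffer)))
        if not chunk:
            raise ConnectionError("peer closed before the message was complete")
        buffer += chunk
    return bytes(buffer)

def recv_message(sock):
    (length,) = struct.unpack("!I", recv_exactly(sock, 4))
    return recv_exactly(sock, length)

a, b = socket.socketpair()            # two already connected stream sockets
message = b"x" * 200_000
sender = threading.Thread(target=send_message, args=(a, message))
sender.start()
received = recv_message(b)            # arrives in many parts, returned whole
sender.join()
a.close()
b.close()
print(len(received))
```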

UDP- User Datagram Protocol

UDP is a message oriented protocol. Message oriented protocols have a message size limit, which depends on the OS they are used on. UDP's message size limit is usually about 64KB. Any attempt to send a larger message will not result in automatic splitting, as with streaming protocols, but in an error: no data will be sent. Also, unlike TCP, with UDP there is no automatic error correction and no guarantee that a message will reach its destination, so there is always a chance that a message will not be received by the remote machine.
So, why is UDP useful, you may ask? For several reasons. First of all, the disadvantages mentioned above, can be considered as advantages in several cases:

Size limit: Limits are not good, but the fact that there is no splitting is. With TCP you have to check if what you got is the full message, or just a part of it- the rest should be following. With UDP, there is no such issue. If the message does not exceed the UDP limit, it will be delivered in one piece. Otherwise, the socket will refuse to send it.
No error correction / resend attempt on failure: Auto correction is good, but it also reduces performance and increases network traffic. In certain cases, you don't really care if the client got all the data. You just want to deliver as much data as possible. An example of that might be live radio, or video. You don't really care if you miss a frame or two, as long as the video keeps playing without delays. As for the reliability of UDP, the fewer the routers involved, the higher it is. Especially on a LAN, the chances of losing a message are very low. An exception to that is exceeding the socket's buffer limit. Since we haven't covered buffers yet, just keep in mind that UDP messaging on a LAN is usually quite reliable.
No need for a connection to be established: With TCP, a remote client needs to connect first. No data can be expected prior to that. With UDP, a socket can accept messages from any machine that may send compatible (UDP) data to the socket's IP:port. The good thing is that you can always ask the socket to return you the IP and port of the sender, since that info exists on every packet's label. Note that unlike with TCP, you can create a UDP server using just one socket. Also, the fact that there is no need to establish a connection offers an additional increase in speed, especially in cases when all you want is to send a single message, instead of exchanging multiple messages.
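The three points above can be tried out with two UDP sockets on the same machine. This is a sketch, with invented variable names, that stays on the loopback interface: the message arrives in one piece, the receiving socket reports who sent it, and an oversized message is refused instead of being split.

```python
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))             # port 0: let the OS pick a free port
server.settimeout(2.0)
server_address = server.getsockname()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.bind(("127.0.0.1", 0))
client.settimeout(2.0)
client_address = client.getsockname()

client.sendto(b"ping", server_address)    # destination given per message
data, sender = server.recvfrom(2048)      # whole message plus sender address
server.sendto(b"pong", sender)            # reply through the same single socket
answer, _ = client.recvfrom(2048)

too_big_refused = False
try:
    client.sendto(b"x" * 70_000, server_address)   # above the UDP size limit
except OSError:
    too_big_refused = True

client.close()
server.close()
print(data, sender, answer, too_big_refused)
```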

UDP- Broadcasting & multicasting

An advantage of UDP lies in its connectionless nature. The fact that with the UDP protocol you can send data without having to connect makes concepts such as broadcasting and multicasting possible. Multicasting in particular relies on routers that support IGMP (Internet Group Management Protocol). Almost all router boxes or software solutions used in LANs are IGMP compatible. However, that's not the case with the routers used on the Internet. Not yet at least. Broadcasting and especially multicasting are considered by many to be the next big thing of the Internet, since they can be the base for bandwidth demanding applications like live tv. Read on to find out why.

Broadcasting

When you want to send a message to a specific destination (IP:port), your message is sent first to the router, and then the router forwards the message to (another router or) the destination machine. Let's assume you want to send the same message to two machines. You would first send it to one machine, and then to the other. What if you wanted to send the same message to all machines on the LAN?
With broadcasting, you can send a message to all machines on a subnet (e.g. all machines on your LAN) at once. In order to do so, you need to have a UDP socket created, and send a 'special' message. Actually, the only special thing about the message is that it is addressed to a subnet (255.255.255.255 will broadcast to all) instead of a particular IP. When such a message reaches the router, it forwards copies of it to all machines attached to it. Any machine that has a UDP socket listening at the port the message was addressed to will receive the message. As you may guess, with broadcasting you can very easily create serverless applications on your LAN.

Steps to create a broadcast chat:

Create a UDP socket, and set it up to listen at a specific port (e.g. 6000)
Use the 'send' command on the socket, with “255.255.255.255” as the destination address, and 6000 as the destination port.

And that's it. If you create such an application and run it on two machines on the same LAN, any string one machine sends will be received by all (including the sender).
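The two steps of the broadcast chat could be sketched as follows in Python. The function names are invented for the example. One detail not mentioned in the steps is an assumption about the platform: many systems refuse to send to a broadcast address unless the SO_BROADCAST option has been set on the socket first.

```python
import socket

CHAT_PORT = 6000   # the example port used in the steps above

def make_chat_socket(port=CHAT_PORT):
    """Step 1: a UDP socket listening at the chat port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Ask the OS for permission to send broadcast messages.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.bind(("", port))           # listen on all local interfaces
    return sock

def say(sock, text, port=CHAT_PORT):
    """Step 2: send one chat line to every machine on the LAN."""
    sock.sendto(text.encode("utf-8"), ("255.255.255.255", port))

def hear(sock):
    """Wait for the next chat line; returns (sender ip, text)."""
    data, (ip, _port) = sock.recvfrom(2048)
    return ip, data.decode("utf-8")
```

A chat program would call make_chat_socket() once, then say() whenever the user types a line and hear() in a loop to display what others have sent.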

Multicasting

Broadcasting is easy, and fast. Yet, it has a major disadvantage. A broadcast message is forwarded to all machines on the LAN, which is a waste of bandwidth and an increase in traffic, if not all machines on the LAN are interested in the message.
That's a problem you don't have with multicasting.
I bet you have never seen a web server with the IP 230.1.10.10. That's because this IP belongs to the range 224.0.0.0 - 239.255.255.255, which is reserved for multicasting (the lowest addresses of the range are set aside for network control traffic). For LAN applications, you can consider any other IP within that scope to be a multicast group.

Let's see how multicasting works:

Create a UDP socket, and set it up to listen at a specific port (e.g. 6000)
Send the 'MulticastGroupJoin' command to the socket, with an IP from the multicast IPs scope as address.
Send a regular (udp) message to that IP:6000.

All machines that have used the MulticastGroupJoin to join the group, and are listening at port 6000 will receive the message.

Multicast groups are handled by the router.
You don't need to have joined a group to send a message to that group- you can send it from any UDP socket. However, you need to join in order to receive messages.
The address reported to your socket when a multicast message is received is not the address of the group, but the IP of the sender.
Both broadcasting and multicasting use standard UDP sockets. The only thing that changes is the destination(s) the router will forward the messages to.
You can leave a multicast group by sending the 'MulticastGroupLeave' command. When you do so, your socket is unaffected- the only thing that changes is that the router will stop forwarding to your machine messages sent to that multicast group's ip.
Multicasting is the most efficient and fastest method for sending messages to multiple destinations.

Note that 'MulticastGroupLeave' or 'MulticastGroupJoin', as well as other commands used in this document, may not be the actual names of the commands, which may differ from OS to OS and from language to language.
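As an example of how the names differ: in Python there are no join or leave commands at all. Joining and leaving a group are done by setting a socket option. The sketch below uses the group address and port from the text. The function names are invented, and "0.0.0.0" is an assumption meaning 'let the OS choose the network interface'.

```python
import socket
import struct

GROUP = "230.1.10.10"   # example multicast address from the text
PORT = 6000

def membership(group, interface="0.0.0.0"):
    """Pack the (group, local interface) pair the join/leave options expect."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))

def make_receiver(port=PORT):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    return sock

def join(sock, group=GROUP):
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership(group))

def leave(sock, group=GROUP):
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP, membership(group))

def send_to_group(text, group=GROUP, port=PORT):
    # No join is needed to send: any UDP socket will do.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(text.encode("utf-8"), (group, port))
```

A receiver would call make_receiver(), then join(), and read messages with recvfrom(), which reports the sender's address and not the group's.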

ICMP- Internet Control Message Protocol

Some of you may have heard this one, most probably haven't. But almost all of you have used it. ICMP is the protocol the ping command uses. It's a message oriented protocol, and is mentioned in this document mostly to demonstrate that there are other base protocols besides tcp and udp that are widely used.
Though the current version of PowerLib's Socket Class doesn't support creating ICMP sockets on demand, another class of the Xtra, the plx_net class, creates such a socket when using its 'ping' command.

Higher Level Protocols- One step closer to the real world

So, we can rely on the low level mechanisms and base protocols to transfer data. But transferring raw binary data from one machine to another is by itself almost useless. The receiver must know what to do with the data being sent to it.
Take a database file, for example. Unless you have a program that can process that data (speaking the database's language, if you prefer), the file is quite useless.
A high-level protocol is a method of encoding data prior to sending them, and decoding them after they are received. Why is all this encoding / decoding required, you may ask?
Imagine a web page that contains some text and an image. There has to be a way for the machine that receives the data to know where the text starts, where it ends, where the image data start and end, and what to do with all this info (how to display them).
That's why the http protocol has been invented (and expaaanded…).
The http protocol is used to transfer several types of data (text, html pages, images etc) in a way that the remote machine can understand.
It adds a header to each message, which informs the remote machine about the type and size of data that follow. As said above, the actual transferring is done by the low level mechanisms according to the base protocols' standards, so what the http protocol (or, to be exact, an http builder/parser routine) really does is build strings, which it then 'sends' to the TCP socket.
Parsing such data and displaying pages on the screen is what a browser does for a living.

A high-level protocol like http, ftp, smtp and MUS, is just a standard. The protocol by itself doesn't convert (encode or decode) data, it just describes a way to format data. A high-level protocol is to the encoded data what the JPG format is to a JPG encoded file- a description of the method to retrieve the content of interest from the encoded data (while a base protocol is to the packet what the method of storing data on a hard disc is to the file's binary content).
Applications that use a certain protocol, encode all data according to the specifications of that protocol before sending them, and decode any incoming data before they can actually use them.
If you can create a base protocol socket a high-level protocol is based on (e.g. a TCP socket for the HTTP protocol), then you can also create an application that can speak that protocol (e.g. a web server).
Not all known protocols are bandwidth-efficient. Have you ever wondered why, every time you send a mail with a 1MB file attached, what you actually send is close to 1.3MB? This is caused by the base64 encoding used as the standard method for encoding attachments: it turns every 3 bytes into 4 characters, which increases the size of the original by about 33%.
Building a custom protocol for your own apps is usually the more efficient, and often simpler, way to go. And don't think that building a protocol is hard. It can be as simple as prefixing all strings you send with a string like '[#filename:myFile.txt, #filesize:100]'. Looks familiar? In order for your scripts to parse the incoming data, all they have to do is search for the first ']' character to retrieve the header, which can be converted directly to a Director PropList. Then, you should expect the next 100 bytes to be the content of the file 'myFile.txt'.
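Parsing such a header takes only a few lines. The sketch below is in Python, where a dictionary plays the role of the PropList. The function names are invented, and the parser assumes that neither file names nor values contain ',' or ']'.

```python
def parse_header(text):
    """Turn '[#filename:myFile.txt, #filesize:100]' into a dictionary."""
    fields = {}
    for item in text.strip("[]").split(","):
        key, value = item.strip().split(":", 1)
        fields[key.lstrip("#")] = value.strip()
    return fields

def split_message(data):
    """Return (header, file content, leftover bytes) from received data."""
    end = data.index(b"]")                       # first ']' closes the header
    header = parse_header(data[:end + 1].decode("ascii"))
    size = int(header["filesize"])
    body = data[end + 1:end + 1 + size]
    if len(body) < size:
        raise ValueError("message incomplete: wait for more data")
    return header, body, data[end + 1 + size:]

incoming = b"[#filename:myFile.txt, #filesize:5]hello[#next"
print(split_message(incoming))
```

The leftover bytes are the beginning of the next message, which matters on a TCP socket, where several messages may arrive glued together.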

Servers, Hosts, Machines and Applications

The term 'server' is widely used. But what exactly do we mean by 'server'?
There are server machines, as well as server applications. There also are applications that are called 'hosts'… And what is the difference between a server and a host, if any?
Since we've already spoken about sockets, let's start from there. When a socket can accept incoming connections, it is called a 'listening socket' or a 'socket in host mode', or, simply a 'host socket'. The application that controls such socket(s) is called a host application, while the machine that runs the application can be called a host machine.
The difference between a host and a server is very vague… It's just a matter of naming.
But, in general, we call a 'server' a machine that can be accessed 24/7 by clients and runs an application that handles the client's requests.
For example, a web server should have a fixed IP, in order for other machines (clients) on the Internet to be able to reach it.
Having a fixed IP isn't a requirement for a server. However, since a server needs clients in order to have a purpose, clients should have a way to discover the server and connect to it. And the easiest way to achieve this (at least for IP based networks, such as the Internet) is by using a fixed, or static, IP.
So, you might say that if you create an application that controls a listening socket and the machine has a fixed Internet IP, then you have a server machine, running a server app. Or at least close to it, since you also need to add code to your application to actually do something with the data the clients will send.
In conclusion:

Server and host are basically the same thing
We usually call a host application an application that is temporarily open to the public, or to a specific machine or range of machines (e.g. a projector that uses a listening socket created to accept a p2p connection from a specific machine), and that, usually, can accept a single or a limited number of incoming connections.
We call a server application an application that can transform a computer to a server: an application that can handle many simultaneous connections, process the incoming data and send replies.
We call a server machine a machine (that runs a server application and) to which clients can connect and transfer data- at least when the server is on-line.
Hardware companies usually call 'servers' machines that are built for 24/7 usage, and are usually more powerful and stable than workstations or desktop machines.
Software companies call 'servers' programs that are built to process incoming requests from a large number of remote machines, and usually organize and store the data they receive.
And, finally, the server version of an OS is usually the 'full throttle' version of the OS, without the limits that 'workstation' or 'home' versions may have. Plus, 'server' OS versions usually contain ready-to-use server applications, like e.g. web, ftp and mail servers.

But, as said above, it's all a matter of naming.

Network programming VS Local programming

Network programming is not that different from local programming- after all, both kinds deal with data, and both use computer languages. However, there are two main differences.

Asynchronous Operations & Errors

When creating a single threaded computer program, you expect everything to happen sequentially. You send a command, and instantly (for light operations at least) get a result. You also expect that the command will be executed without an error. If an error occurs, then there is probably something wrong with your code, or with the user's configuration, or with the hardware. In computer programming errors are bad, and often crucial.
On the contrary, with network programming errors are a very common thing. And when an error occurs, you have no hardware or OS to blame. Errors may occur e.g. when a connection is dropped, or when a server is too busy. Your application must know how to handle such an error- e.g. it can display a warning, or silently try again instantly or later etc.
Also, with network programming, you must get used to the idea of background operations. Imagine if your browser stopped responding till a web page was fully loaded… Or a 10MB file was downloaded.
With network programming, callbacks and errors are your allies. They inform you about the result of events or incoming requests. At least this is the case with asynchronous operations. As for synchronous operations (often called 'blocking' in the socket's world) I'd say don't bother, unless you like the idea of machines that stop responding till any network request (send/get reply) is completed.
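What 'callbacks instead of waiting' means in practice can be sketched with Python's standard selectors module. This is one of several ways to do asynchronous socket work, and not the mechanism of any particular library. The names on_readable and pump are invented, and socket.socketpair() stands in for a connection to a remote machine.

```python
import selectors
import socket

selector = selectors.DefaultSelector()
events = []

def on_readable(sock):
    """Callback: runs only when the socket has something to report."""
    data = sock.recv(4096)
    if data:
        events.append(("data", data))
    else:
        events.append(("closed", b""))      # the other end went away
        selector.unregister(sock)
        sock.close()

local, remote = socket.socketpair()
local.setblocking(False)
selector.register(local, selectors.EVENT_READ, on_readable)

def pump(timeout=1.0):
    """One turn of the event loop: run the callback of every ready socket."""
    for key, _mask in selector.select(timeout):
        key.data(key.fileobj)

remote.sendall(b"reply")    # the 'remote machine' answers...
pump()
remote.close()              # ...and then drops the connection
pump()
selector.close()
print(events)
```

Between two calls to pump() the application is free to do anything else, which is the whole point.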
Just for the record, and excepting network errors, network programming is very similar to programming applications that utilize multiple threads.

Imagine for example, that you want to apply a very heavy filter on an image.

A single threaded program will just stop responding till the filter is applied.
A multi-threaded program could allow you to work on other images till the filter is applied. When done, the second thread (the one in which the filter will be applied) will send a message to the main thread to inform it that processing is complete.
With network programming, an application could send the image to a remote computer and tell that computer to apply the filter. When done, the remote machine will send the processed image back to the first application.

When building a server that is expected to do heavy processing, it's good to use a new thread to do the job. Otherwise, all network operations will pause till the work is done. Since Director doesn't support multiple threads, you could consider using a second (third, etc) projector to do the job: launch a projector, pass the data to it and tell it what to do. When done, the projector will send the data back to the main movie and then quit.
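In a language that does support threads, handing the heavy job to a second thread might look like the sketch below. It is written in Python. The 'filter' is a trivial stand-in, and the names heavy_filter and on_done are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

def heavy_filter(pixels):
    # Stand-in for expensive image processing: invert 8-bit values.
    return [255 - p for p in pixels]

done = threading.Event()
results = []

def on_done(future):
    """Called when the worker has finished: collect the result."""
    results.append(future.result())
    done.set()

with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(heavy_filter, [0, 128, 255])
    future.add_done_callback(on_done)
    # The main thread is free here, e.g. to keep servicing sockets.
    done.wait(timeout=5)

print(results)
```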