An Introduction to Network Programming - The art of transferring data


I assume that -more or less- you all know what a network is. But, just to make sure:
A network is a link between two or more ends, used to transfer something from one end to another.
There are railroad networks, telecommunications networks, broadcasting networks, and there also are computer networks.
Each of them carries something- people, supplies, signals / data- and all have a starting point and a destination. In most of the above examples, the starting point (or sender) can also be the destination (receiver), and vice versa (bidirectional networks).
Let's leave the trains out of this for now, and concentrate on computer networks.
Computer networks are wired or wireless links that are established between computers, and give one end the ability to access data (stored or generated) on the other end, and (not always) vice versa.
By skipping several steps, and decades of evolution, we assume we have a computer connected to the Internet. According to the paragraph above, the Internet, as a computer network, is used to transfer data from one computer to another. But how is this accomplished?

Let's see what happens every time you want to open a web page:
  1. You turn on your computer (if it's one of the lucky machines that are being shut down occasionally, that is)
  2. You dial your Internet Service Provider.
  3. Your ISP checks your user name and password.
  4. You open a browser and enter a valid url. A page opens up in your browser.

And now, let's see what sockets, protocols, IPs and ports have to do with the above.


IP: a machine's ID on a net.

Think of the IP as someone's telephone number, or as the address of a building, or as the name of a global variable of a computer program. Each of the above has to be unique, and is used to address its owner (make a call, deliver a parcel, retrieve the variable's content) when needed.
Every machine on an IP based network is known by its IP. In order for your machine to become a member of such a network, it must have a unique IP (for IPv4, the familiar number in xxx.xxx.xxx.xxx format).
In step 3, an IP is automatically assigned to your machine by your ISP: Every ISP owns a range of unique IP numbers. When you call your ISP, one of its IPs is assigned to your machine- your communications device (modem/network card) informs your machine which IP that is. That particular IP is reserved, and won't be assigned to another machine, as long as you stay connected.
In step 4 you typed a human readable name string (url). As said above though, a machine is known by its IP. So, there has to be a method to convert that name to an IP, for your machine to understand what you typed. This is what Domain Name Servers do. They try to translate a name to an IP. This procedure is called 'DNS lookup'. Whenever you type a url in your browser, a message containing the 'domain' (the www.domain.suffix part) is sent to such a server. If the name is unknown to the DNS, a 'not found' is returned, usually resulting in a 'Host not found' reply on your monitor.
If the name is known, its corresponding IP is returned to your machine. But, in order to send a message, knowing the IP is not enough. Another thing called 'port' is required.
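A DNS lookup can be tried out in a couple of lines. The following is only a sketch in Python (this document doesn't depend on any particular language); the helper name 'lookup' is made up for the example, and the address you get back for a real domain depends on your resolver.

```python
import socket


def lookup(name):
    """Return the IPv4 address for a host name, or None if the lookup fails.

    'lookup' is a hypothetical helper name used only in this example.
    """
    try:
        # Asks the system resolver (and, through it, a DNS server) for the IP.
        return socket.gethostbyname(name)
    except socket.gaierror:
        # The name is unknown: the 'Host not found' case described above.
        return None
```

Calling lookup('www.example.com') should return a string in xxx.xxx.xxx.xxx format when the name is known, and None when it is not.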


What are 'ports' and why do we need them?

According to the above paragraph, since we have an IP and we know the remote machine's IP, we should be able to start communicating, right? Well, sort of.
Imagine a server machine, running a web server and an ftp server application. Many remote machines will probably want to communicate with such a server. Some will ask for web pages, while others may want to transfer files.
Let's examine step 4 of the previous example again: your machine sends a message to the server, in order to retrieve a web page. Let's suppose that at the same time, another machine is sending an ftp request, in order to initiate a file transfer.
The same machine (server) accepts two requests at the same time. So, the server should be able to process both messages, tell whether it's a web page request, or an ftp command it received, and act accordingly. And processing takes time. Wouldn't it be better if the data it received had a stamp on them, declaring what type of data they are? Well, instead of that, they have a label, called 'port'. A port is a 16-bit number (0-65535) that is used for internal forwarding of the received data to the appropriate application.
At this point, we can start talking about 'packets'. Whenever you send a message, your computer wraps the data inside a container -think of it as a parcel/packet- and puts a tag on it, containing the IP and port of the machine it is addressed to. It then sends the packet to the ISP. From there on, routers take control, and, by reading the address, they try to send the packet to the router they think is closer to the final recipient.
[ Since we speak of packets, it may be handy to know that the label of a packet also contains a number (hops, called 'time to live' or TTL in IPv4) that is decreased by one every time the message is sent from one router to another. If the number reaches zero, the router holding the packet at that moment destroys it. This is a method to avoid having packets traveling eternally from router to router without ever reaching a destination.]
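To make the 'label' idea concrete, here is a toy model in Python. It is not how a real network stack stores packets; the 'Packet' class, its field names and the 'route' function are assumptions made up for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """Toy packet: an address label plus the data it carries (illustrative only)."""
    dest_ip: str
    dest_port: int
    hops_left: int
    payload: bytes


def route(packet, routers_on_path):
    """Pass the packet through a number of routers.

    Returns True if the packet survives all of them, False if a router
    destroys it because its hop counter reached zero.
    """
    for _ in range(routers_on_path):
        packet.hops_left -= 1          # every router decreases the counter by one
        if packet.hops_left <= 0:
            return False               # counter exhausted: the packet is destroyed
    return True
```

A packet created with hops_left=3 survives a path of two routers, but is destroyed on a path of five.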

In conclusion:

The 'Socket' concept

So far we spoke about IPs and ports. But how can an application actually use them, in order to communicate with a remote application?
That's what 'sockets' are there for. Sockets are the means of communication between your application and the lower level mechanisms that are responsible for all network actions. Most operating systems support sockets, and many mid/hi-level commands (like connect, send, receive) share the same syntax and use common low-level protocols (os>nic>signal>nic>os), while an OS may also support extra socket commands.
With the use of sockets and common protocols, communication between applications running even on different platforms is achieved.
The main property (socket type) that describes a socket is the base (mid-level, as far as your computer is concerned, low-level, as far as e.g. a C++ application is concerned) protocol it will use. The most commonly used sockets are tcp (stream) and udp (datagram) sockets.
In step 4 of our example, as soon as the IP of the remote machine is retrieved, your browser creates a tcp socket and sends the 'connect' command to it, with the IP and destination port of the remote machine passed as arguments to the command. The OS's socket handling mechanism (winsock, for windows) takes control. If the connect command is successful, both sockets (local and remote) are notified, and the notification is then passed to your application's callback handler, assuming one has been defined. The connection has been established. You can now start sending (issue the 'send' command to the socket, with your data as argument) and receiving data.
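The connect / send / receive sequence described above can be sketched with Python's standard socket module. To keep the example self-contained, both ends run on the same machine (127.0.0.1) and the calls are the simple blocking kind, not the callback style mentioned above. The function names and the 'uppercase the message' behaviour are made up for the demonstration.

```python
import socket
import threading


def serve_one(listener):
    """Accept a single connection, read one message and reply in uppercase."""
    conn, _addr = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def tcp_round_trip(message=b"hello"):
    """Create a listening socket, connect to it, send a message, return the reply."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    server_thread = threading.Thread(target=serve_one, args=(listener,))
    server_thread.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # a tcp (stream) socket
    try:
        client.connect(("127.0.0.1", port))  # the 'connect' command: IP and port
        client.sendall(message)              # the 'send' command
        reply = client.recv(1024)            # receive the answer
    finally:
        client.close()
        server_thread.join()
        listener.close()
    return reply
```

With the default argument, tcp_round_trip() sends b'hello' and gets the uppercased bytes back from the other socket.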

Notes:

Protocols- The network's languages

Every communication requires a method. Usually, this method is called a language. On computer networks, it is called a protocol.
Most languages can be considered high-level, since they are based on lower-level methods / concepts:
Spoken language: meaning>words>mouth>sounds>ear>words>meaning
Director's Lingo: lingo command>C++ compiled code>cpu>C++ compiled code>result
Http protocol: http string>tcp packet>signal>tcp packet>http string
Ok, some steps have been skipped in the above, but in general you get the idea.
So, you could say that C++ is to lingo what the tcp protocol is to the http protocol… sort of.
As with all methods of communications, the higher the level, the easier to use, but also slower, since extra processing is required. The mind is (or at least should be) faster than the mouth.
From now on, we'll be calling protocols such as tcp and udp 'base' protocols, since this is as low as we'll be getting- no real need to go closer to the machine. And we'll be calling protocols such as http and ftp high-level protocols, since… well, that's what they are. They sit on top of other protocols, so they should look taller as well…

In general:

TCP- Transmission Control Protocol

TCP is by far the most commonly used protocol in today's networks. It is not the fastest, but it's a 'reliable' protocol. A protocol is usually called 'reliable' when it supports mechanisms that guarantee both the order and the error-free delivery of the message, as well as the delivery itself.
TCP is a 'streaming' protocol. A 'streaming' protocol is one that has no fixed message size limit. A lengthy message is automatically split into parts by the lower level mechanisms, in accordance with the TCP protocol. Those parts are wrapped and sent to the remote machine. The remote machine receives those packets and then has to append the data contained in each packet to a local buffer, in order to reconstruct the original message.
Furthermore, in order to transfer data with the tcp protocol, a connection (socket to socket) must have been established between two machines.
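The 'append to a local buffer' step is something the receiving application usually has to code itself. Below is a minimal Python sketch. The tiny chunk size is chosen on purpose so that the message arrives in several pieces, and socket.socketpair() merely stands in for a real TCP connection. The function names are assumptions made for this example.

```python
import socket


def recv_exactly(sock, total, chunk_size=4):
    """Keep reading from a stream socket until 'total' bytes have been collected."""
    buffer = bytearray()
    while len(buffer) < total:
        chunk = sock.recv(min(chunk_size, total - len(buffer)))
        if not chunk:
            # An empty read means the other end closed the connection.
            raise ConnectionError("connection closed before the full message arrived")
        buffer.extend(chunk)              # append this part to the local buffer
    return bytes(buffer)


def demo_reassembly(message=b"hello, world"):
    """Send a message over a connected pair of sockets and rebuild it piece by piece."""
    sender, receiver = socket.socketpair()   # two already-connected stream sockets
    with sender, receiver:
        sender.sendall(message)
        return recv_exactly(receiver, len(message))
```

Note that the receiver has to know how many bytes to expect; a stream carries no message boundaries of its own.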


UDP- User Datagram Protocol

UDP is a message oriented protocol. Message oriented protocols have a message size limit, which depends on the OS they are used on. UDP's message size limit is usually about 64KB. Any attempt to send a larger message will not result in automatic splitting, as with streaming protocols, but in an error- no data will be sent. Also, unlike TCP, with UDP there is no error auto correction and no guarantee that a message will reach its destination: a message may be lost on the way, and the sender will not be told.
So, why is UDP useful, you may ask? For several reasons. First of all, the disadvantages mentioned above can be considered as advantages in several cases:
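Both points (no connection needed, and an error instead of automatic splitting) can be observed with a few lines of Python. This is a sketch that talks to the local machine only; the 70000 byte size is simply a value above the usual limit, and 'udp_demo' is a name invented for the example.

```python
import socket


def udp_demo():
    """Send one small and one oversized datagram to a local UDP socket.

    Returns (data_received, oversize_rejected).
    """
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
    receiver.settimeout(2.0)              # don't wait forever if the datagram is lost
    address = receiver.getsockname()

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # No 'connect' step: the destination is given with every message.
        sender.sendto(b"short message", address)
        received, _sender_address = receiver.recvfrom(2048)

        try:
            sender.sendto(b"x" * 70000, address)   # larger than a datagram can be
            oversize_rejected = False
        except OSError:
            oversize_rejected = True               # nothing was sent at all
        return received, oversize_rejected
    finally:
        sender.close()
        receiver.close()
```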


UDP- Broadcasting & multicasting

An advantage of UDP lies in its connectionless nature. The fact that with the UDP protocol you can send data without having to connect makes concepts such as broadcasting and multicasting possible. Multicasting requires an IGMP (Internet Group Management Protocol) compatible router- almost all router boxes or software solutions used in LANs are IGMP compatible. However, that's not the case with the routers used on the Internet. Not yet at least. Broadcasting and especially multicasting are considered by many to be the next big thing of the Internet, since they can be the base for bandwidth demanding applications like live tv. Read on to find out why.

Broadcasting

When you want to send a message to a specific destination (IP:port), your message is sent first to the router, and then the router forwards the message to (another router or) the destination machine. Let's assume you want to send the same message to two machines. You should first send it to the one machine, and then to the other. What if you wanted to send the same message to all machines on the LAN?
With broadcasting, you can send a message to all machines on a subnet (e.g. all machines on your LAN) at once. In order to do so, you need to have a UDP socket created, and send a 'special' message. Actually, the only special thing about the message is that it is addressed to a subnet (255.255.255.255 will broadcast to all) instead of a particular IP. When such a message reaches the router, it forwards copies of it to all machines attached to it. Any machine that has a UDP socket listening at the port the message was addressed to will receive the message. As you may guess, with broadcasting you can very easily create serverless applications on your LAN.

Steps to create a broadcast chat:
  1. Create a UDP socket, and set it up to listen at a specific port (e.g. 6000)
  2. Use the 'send' command on the socket, with '255.255.255.255' as the destination address, and 6000 as the destination port.

And that's it. If you create such an application and run it on two machines on the same LAN, any string one machine sends will be received by all (including the sender).
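The two steps above could look like this in Python. This is a sketch, not a finished chat: the function names are invented, and one detail not mentioned in the steps is that the sockets API used here expects the SO_BROADCAST option to be switched on before it lets you send to a broadcast address.

```python
import socket

CHAT_PORT = 6000                          # the example port from the steps above
BROADCAST_ADDRESS = "255.255.255.255"


def make_chat_socket(port=CHAT_PORT):
    """Step 1: create a UDP socket and set it up to listen at the given port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Permission to send to a broadcast address must be requested explicitly.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.bind(("", port))                 # "" = listen on all local interfaces
    return sock


def say(sock, text, port=CHAT_PORT):
    """Step 2: send the text to every machine on the LAN."""
    sock.sendto(text.encode("utf-8"), (BROADCAST_ADDRESS, port))


def listen_once(sock):
    """Wait for the next chat line; returns (sender_ip, text)."""
    data, (sender_ip, _sender_port) = sock.recvfrom(4096)
    return sender_ip, data.decode("utf-8", errors="replace")
```

A chat client would call make_chat_socket() once, then call say() for every line the user types and listen_once() in a loop (or in the background) to display what the others send.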


Multicasting

Broadcasting is easy, and fast. Yet, it has a major disadvantage. A broadcast message is forwarded to all machines on the LAN, which is a waste of bandwidth and an increase in traffic, if not all machines on the LAN are interested in the message.
That's a problem you don't have with multicasting.
I bet you have never seen a web server with the IP 230.1.10.10. And that's because this IP belongs in the range 224.0.1.1- 239.255.255.255. This is a range reserved for multicasting. For LAN applications, you can consider any IP within that scope to be a multicast group.

Let's see how multicasting works:
  1. Create a UDP socket, and set it up to listen at a specific port (e.g. 6000)
  2. Send the 'MulticastGroupJoin' command to the socket, with an IP from the multicast IPs scope as address.
  3. Send a regular (udp) message to that IP:6000.
All machines that have used the MulticastGroupJoin to join the group, and are listening at port 6000 will receive the message.
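In Python's socket module, the 'MulticastGroupJoin' step corresponds to setting the IP_ADD_MEMBERSHIP option on the socket. The sketch below uses the group address and port from this section; the function names are made up for the example, and whether the join succeeds depends on the machine having a multicast capable network interface.

```python
import ipaddress
import socket
import struct

GROUP = "230.1.10.10"                     # example group address from the text
PORT = 6000


def membership_request(group):
    """Build the 8-byte join request: group address + local interface.

    0.0.0.0 as the interface means 'let the OS choose one'.
    """
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))


def join_group(group=GROUP, port=PORT):
    """Steps 1 and 2: create a listening UDP socket and join the multicast group."""
    if not ipaddress.ip_address(group).is_multicast:
        raise ValueError(group + " is not in the multicast range")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def send_to_group(text, group=GROUP, port=PORT):
    """Step 3: send a regular UDP message to the group address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(text.encode("utf-8"), (group, port))
```

Note that the sender does not have to join the group; only the machines that want to receive the messages do.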


Note that 'MulticastGroupLeave' or 'join', as well as other commands used in this document, may not be the actual names of the commands, which may differ from OS to OS and from language to language.


ICMP- Internet Control Message Protocol

Some of you may have heard this one, most probably haven't. But almost all of you have used it. ICMP is the protocol the ping command uses. It's a message oriented protocol, and is mentioned in this document mostly to demonstrate that there are other base protocols besides tcp and udp that are widely used.
Though the current version of PowerLib's Socket Class doesn't support creating ICMP sockets on demand, another class of the Xtra, the plx_net class, creates such a socket when using its 'ping' command.


Higher Level Protocols- One step closer to the real world

So, we can rely on the low level mechanisms and base protocols to transfer data. But transferring raw binary data from one machine to another is by itself almost useless. The receiver must know what to do with the data being sent to it.
Take a database file, for example. Unless you have a program that can process that data (speaking the database's language, if you prefer), it is quite useless.
A high-level protocol is a method of encoding data prior to sending them, and decoding them after they are received. Why is all this encoding / decoding required, you may ask?
Imagine a web page that contains some text and an image. There has to be a way for the machine that receives the data to know where the text starts, where it ends, where the image data start and end, and what to do with all this info (how to display them).
That's why the http protocol was invented (and expaaanded…).
The http protocol is used to transfer several types of data (text, html pages, images etc) in a way that the remote machine can understand.
It adds a header to each message, which informs the remote machine about the type and size of data that follow- as said above, the actual transferring is done by the low level mechanisms according to the base protocols' standards, so, what the http protocol (or, to be exact, an http builder/parser routine) really does is build strings, which it then 'sends' to the TCP socket.
Parsing such data and displaying pages on the screen is what a browser does for a living.
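Here is what such a builder/parser routine might look like, reduced to the bare minimum in Python. Real http messages carry many more header lines; the two used here (Content-Type and Content-Length) are the ones that describe the type and size of the data. The function names are made up for this sketch.

```python
def build_http_response(body, content_type):
    """Prefix the raw data with a header describing its type and size."""
    header = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"                              # an empty line ends the header
    )
    return header.encode("ascii") + body


def parse_http_response(raw):
    """Split a message built by build_http_response back into its parts."""
    head, _separator, rest = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _colon, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    length = int(headers["content-length"])
    return status_line, headers, rest[:length]
```

The bytes returned by build_http_response are exactly what would be handed to the 'send' command of a TCP socket.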
  1. A high-level protocol like http, ftp, smtp and MUS, is just a standard. The protocol by itself doesn't convert (encode or decode) data, it just describes a way to format data. A high-level protocol is to the encoded data what the JPG format is to a JPG encoded file- a description of the method to retrieve the content of interest from the encoded data (while a base protocol is to the packet what the method of storing data on a hard disc is to the file's binary content).
  2. Applications that use a certain protocol, encode all data according to the specifications of that protocol before sending them, and decode any incoming data before they can actually use them.
  3. If you can create a base protocol socket a high-level protocol is based on (e.g. a TCP socket for the HTTP protocol), then you can also create an application that can speak that protocol (e.g. a web server).
  4. Not all known protocols are bandwidth-efficient. Have you ever wondered e.g. why every time you want to send a mail with a 1MB file attached, what you actually send is close to 1.3MB? This is caused by the base64 encoding used as the standard method for encoding attachments, which increases the size of the original by 33%.
  5. Building a custom protocol for your own apps is usually the most efficient- and often simplest- way to go. And don't think that building a protocol is hard. It can be as simple as prefixing all strings you send with a string like '[#filename:myFile.txt, #filesize:100]'. Looks familiar? In order for your scripts to parse the incoming data, all they have to do is search for the first ']' character, to retrieve the header, which can be converted directly to a Director PropList. Then, you should expect the next 100 bytes to be the content of the file 'myFile.txt'.
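Points 4 and 5 of the list above are easy to try out. The Python sketch below builds and parses the '[#filename:..., #filesize:...]' header from point 5, and measures the base64 growth from point 4. The function names are made up, and the parser assumes that file names contain no ',' or ']' characters.

```python
import base64


def frame(filename, content):
    """Prefix the file's bytes with a '[#filename:..., #filesize:...]' header."""
    header = f"[#filename:{filename}, #filesize:{len(content)}]"
    return header.encode("utf-8") + content


def unframe(data):
    """Find the first ']', read the header, then take 'filesize' bytes of content."""
    end = data.index(b"]")
    header_text = data[1:end].decode("utf-8")       # drop the surrounding brackets
    fields = {}
    for item in header_text.split(","):
        key, _colon, value = item.strip().partition(":")
        fields[key.lstrip("#")] = value
    size = int(fields["filesize"])
    body = data[end + 1:end + 1 + size]
    return fields["filename"], body


def base64_growth(raw_size):
    """Ratio between the base64 encoded size and the original size."""
    return len(base64.b64encode(bytes(raw_size))) / raw_size
```

base64 turns every 3 bytes into 4 characters, so base64_growth(3000) gives a ratio of 4/3; the line breaks that mail programs add to the encoded text account for the rest of the overhead.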



Servers, Hosts, Machines and Applications

The term 'server' is widely used. But what exactly do we mean by 'server'?
There are server machines, as well as server applications. There also are applications that are called 'hosts'… And what is the difference between a server and a host, if any?
Since we've already spoken about sockets, let's start from there. When a socket can accept incoming connections, it is called a 'listening socket' or a 'socket in host mode', or, simply a 'host socket'. The application that controls such socket(s) is called a host application, while the machine that runs the application can be called a host machine.
The difference between a host and a server is very vague… It's just a matter of naming.
But, in general, we call a 'server' a machine that can be accessed 24/7 by clients and runs an application that handles the client's requests.
For example, a web server should have a fixed IP, in order for other machines (clients) on the Internet to be able to reach it.
Having a fixed IP isn't a requirement for a server. However, since a server needs clients in order to have a purpose, clients should have a way to discover the server and connect to it. And the easiest way to achieve this (at least for IP based networks, such as the Internet) is by using a fixed, or static, IP.
So, you might say that if you create an application that controls a listening socket and the machine has a fixed Internet IP, then you have a server machine, running a server app. Or at least close to, since you also need to add code to your application that actually does something with the data the clients will send.
In conclusion:
  1. Server and host are basically the same thing
  2. We usually call a host application an application that is temporarily open to the public, or to a specific machine or range of machines- e.g. a projector that uses a listening socket created to accept a p2p connection from a specific machine- and that, usually, can accept a single or a limited number of incoming connections.
  3. We call a server application an application that can transform a computer into a server: an application that can handle many simultaneous connections, process the incoming data and send replies.
  4. We call a server machine a machine (that runs a server application and) to which clients can connect and transfer data- at least when the server is on-line.
  5. Hardware companies usually call 'servers' machines that are built for 24/7 usage, and are usually more powerful and stable than workstations or desktop machines.
  6. Software companies call 'servers' programs that are built to process incoming requests from a large number of remote machines, and usually organize and store the data they receive.
  7. And, finally, the server version of an OS is usually the 'full throttle' version of the OS, with no limits that 'workstation' or 'home' versions may have. Plus, 'server' OS versions usually contain ready-to-use server applications, like e.g. web, ftp and mail servers.
But, as said above, it's all a matter of naming.
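For the record, here is roughly what 'an application that can handle many simultaneous connections, process the incoming data and send replies' can look like, using the socketserver module from Python's standard library. It is only a sketch: the 'uppercase the line' processing and all the names are invented for the example.

```python
import socket
import socketserver
import threading


class UppercaseHandler(socketserver.StreamRequestHandler):
    """Runs once per client connection: read one line, reply in uppercase."""

    def handle(self):
        line = self.rfile.readline()
        self.wfile.write(line.upper())


def start_server(host="127.0.0.1", port=0):
    """Create the listening socket and serve clients in the background."""
    # ThreadingTCPServer gives every incoming connection its own thread.
    server = socketserver.ThreadingTCPServer((host, port), UppercaseHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ask(port, text, host="127.0.0.1"):
    """A tiny client: connect, send one line, return the server's reply."""
    with socket.create_connection((host, port), timeout=2) as conn:
        conn.sendall(text.encode("utf-8") + b"\n")
        return conn.makefile("rb").readline().decode("utf-8").strip()
```

After server = start_server(), the port the OS picked can be read from server.server_address[1]; server.shutdown() followed by server.server_close() stops it again.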


Network programming VS Local programming

Network programming is not that different from local programming- after all, both kinds deal with data, and both use computer languages. However, there are two main differences.


Asynchronous Operations & Errors

When creating a single threaded computer program, you expect everything to happen sequentially. You send a command, and instantly (for light operations at least) get a result. You also expect that the command will be executed without an error. If an error occurs, then there is probably something wrong with your code, or with the user's configuration, or with the hardware. In computer programming errors are bad, and often crucial.
On the contrary, with network programming errors are a very common thing. And when an error occurs, you have no hardware or OS to blame. Errors may occur e.g. when a connection is dropped, or when a server is too busy. Your application must know how to handle such an error- e.g. it can display a warning, or silently try again instantly or later etc.
Also, with network programming, you must get used to the idea of background operations. Imagine if your browser stopped responding till a web page was fully loaded… Or a 10MB file was downloaded.
With network programming, callbacks and errors are your allies. They inform you about the result of events or incoming requests. At least this is the case with asynchronous operations. As for synchronous operations (often called 'blocking' in the socket's world) I'd say don't bother, unless you like the idea of machines that stop responding till any network request (send/get reply) is completed.
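A small Python sketch of the idea: the request runs in the background, and the application is told about the outcome through one of two callbacks, one for a reply and one for an error. The function and parameter names are made up; in this particular sketch the callbacks are called from the background thread.

```python
import socket
import threading


def request_in_background(host, port, payload, on_reply, on_error):
    """Send payload to host:port without making the caller wait.

    Exactly one of the two callbacks is called when the operation ends:
    on_reply(data) on success, on_error(exception) on any network error.
    """
    def worker():
        try:
            with socket.create_connection((host, port), timeout=2) as conn:
                conn.sendall(payload)
                on_reply(conn.recv(1024))
        except OSError as error:
            # Refused or dropped connection, timeout, busy server...
            # Here such errors are expected events, not bugs.
            on_error(error)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread          # the caller is free to continue with other work
```

An on_error handler could display a warning, or simply call request_in_background again a bit later.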
Just for the record, and excepting network errors, network programming is very similar to programming applications that utilize multiple threads.

Imagine for example, that you want to apply a very heavy filter on an image.

When building a server that is expected to do heavy processing, it's good to use a new thread to do the job. Otherwise, all network operations will pause till the work is done. Since Director doesn't support multiple threads, you could consider using a second (third, etc) projector to do the job: launch a projector, pass the data to it and tell it what to do. When done, the projector will send the data back to the main movie and then quit.
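In a language that does support threads, handing the heavy job to a worker could be sketched as follows (Python, standard library only). The 'filter' here is a trivial stand-in that inverts 8-bit pixel values, and the function names are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def heavy_filter(pixels):
    """Stand-in for an expensive image filter: invert 8-bit pixel values."""
    return [255 - value for value in pixels]


def process_in_background(executor, pixels, on_done):
    """Hand the job to a worker thread; on_done(result) is called when it finishes."""
    future = executor.submit(heavy_filter, pixels)
    future.add_done_callback(lambda finished: on_done(finished.result()))
    return future
```

While the worker is busy, the main thread can keep handling network events. For work that is purely CPU bound, a separate process (Python offers ProcessPoolExecutor for that) is closer to the 'second projector' idea described above.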