This post is mirrored on my blog.
In order to effectively write applications that communicate via sockets, there were some realizations I needed to make that weren't explicitly told to me by any of the documentation I read.
If you have experience writing applications using sockets, all of this information should be obvious to you. It wasn't obvious to me as an absolute beginner, so I'm trying to make it more explicit in the hopes of shortening another beginner's time getting their feet wet with sockets.
TCP reliability vs. application reliability
TCP guarantees reliability with regard to the stream; it does not
guarantee that every send() was recv()'d by the application on the
other end. This distinction is important. It took me a while to
realize it.
The core problem I was trying to solve is how to cleanly handle a network partition, which is when a machine A and another machine B become completely disconnected. TCP, of course, cannot ensure your messages are delivered if the machine is off, or disconnected from the network. TCP will keep the data in its send buffers for a while, then eventually time out and drop the data. I'm sure there's more to it than that, but from an application perspective that's all I need to know.
The implications of this are important. If I send() a message, I have
no guarantees that the other machine will recv() it if it is suddenly
disconnected from the network. Again, this may be obvious to an
experienced network programmer, but to an absolute beginner like me it
was not. When I read "TCP ensures reliable delivery" I mistakenly
thought that meant that e.g. send() would have blocked and returned
success after the recv() end successfully received the message.
Such a send() could be written, and it would then guarantee at the
application level that the messages definitely got to the receiving
application, and that they were read by the receiving application.
However, this would grind application interaction to a halt, because
every call to this send() would cause the application to wait for
confirmation that the other application received it.
Instead, we hope that the other application is still connected, and the
data from one or many send() calls queues up in a buffer that TCP
handles for us. TCP then does its best to get that data to the other
application, but in the event of a disconnect we effectively lose
whatever had not yet been delivered.
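To make this concrete, here is a minimal sketch in Python, whose socket module wraps the same send() and recv() calls. It assumes a loopback TCP connection, and the function name demo_unread_send is made up for illustration. The sender gets a successful return from send() even though the receiving side never calls recv().

```python
import socket


def demo_unread_send(payload: bytes = b"build finished") -> int:
    """Send to a peer that never calls recv(); return what send() reported."""
    # Hypothetical setup: a listener on an ephemeral loopback port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    sender = socket.create_connection(listener.getsockname())
    receiver, _ = listener.accept()
    try:
        # send() returns once the bytes are copied into the kernel's send
        # buffer. It says nothing about the peer application reading them.
        return sender.send(payload)
    finally:
        # The receiver is closed without ever calling recv(): the
        # application never saw the payload, yet send() reported success.
        receiver.close()
        sender.close()
        listener.close()


if __name__ == "__main__":
    print(demo_unread_send())
```

The return value is only the number of bytes the kernel accepted, which is why it cannot stand in for an application-level confirmation.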
Application reliability
Application developers need to decide how their application reacts to unexpected disconnects. On each bit of data you send, you need to decide how hard you'll try to know it actually got to the receiving application.
The counter-intuitive thing about this is it means implementing acknowledgment messages, labeling messages with IDs, creating a buffer and system to re-send messages, and/or (depending on the application) possibly even timeouts associated with each message. That sounds a lot like TCP, doesn't it? The difference is that you are not dealing with the unreliability of individual IP packets like TCP is. You are dealing with the unreliability of networked machines staying on and connected in general.
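As a rough sketch of the bookkeeping this implies, the following Python class keeps every message until the receiving application acknowledges its ID. The class name Outbox and its methods are hypothetical, and the wire format, timeouts, and persistence are deliberately left out.

```python
import itertools


class Outbox:
    """Tracks messages until the receiving application acknowledges them."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.pending = {}  # message ID -> payload not yet acknowledged

    def queue(self, payload: bytes) -> int:
        """Label a payload with a fresh ID and remember it until acked."""
        message_id = next(self._next_id)
        self.pending[message_id] = payload
        return message_id

    def acknowledge(self, message_id: int) -> None:
        """The peer application confirmed it processed this message."""
        self.pending.pop(message_id, None)

    def unacknowledged(self) -> list:
        """Everything to re-send after a reconnect, oldest ID first."""
        return sorted(self.pending.items())


if __name__ == "__main__":
    outbox = Outbox()
    progress_id = outbox.queue(b"progress 50%")
    outbox.queue(b"build success")
    outbox.acknowledge(progress_id)
    print(outbox.unacknowledged())
```

After a reconnect, the sender would re-send everything returned by unacknowledged(). The receiver then has to tolerate seeing the same ID twice, since an acknowledgment can be lost just like any other message.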
It may seem annoying that you may need to implement all of these things, but it does allow your application to gain some interesting abilities:
- You can store packets of data on the hard drive, then if the entire application or machine crashes, you can still try to send that data when everything starts back up.
- You can allow disconnects during long-running operations, then when the two machines eventually reconnect, the operation's results can be shared.
- You can decide on a case-by-case basis how hard you want to try to confirm delivery. For example, I might try very hard to report a long operation is completed, but I might not care as much about dropping the data which reports the operation's progress over time. The former might make the user think they must re-run the potentially costly operation, but the latter might just make a progress bar move a little more erratically.
You may not need to care about application-level reliability. Many
applications simply exit when a disconnect occurs at an unexpected time.
In my case, I wanted my applications to gracefully continue by
attempting to reestablish the connection every so often. This meant I
needed a separate reconnect loop which would sleep for a bit, then
attempt to reconnect and resume normal operation if successful.
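A reconnect loop of this kind might look like the following Python sketch. The function name connect_with_retry, the attempt count, and the delays are assumptions for illustration, not values from my application.

```python
import socket
import time


def connect_with_retry(address, attempts=5, delay_seconds=2.0):
    """Try to connect, sleeping between failures. Returns a socket or None."""
    for attempt in range(attempts):
        try:
            # The 5 second timeout also applies to later send()/recv()
            # calls on the returned socket unless changed with settimeout().
            return socket.create_connection(address, timeout=5.0)
        except OSError:
            # Refused, unreachable, or timed out: the peer is not back yet.
            if attempt + 1 < attempts:
                time.sleep(delay_seconds)
    return None


if __name__ == "__main__":
    conn = connect_with_retry(("127.0.0.1", 9000), attempts=3, delay_seconds=1.0)
    print("connected" if conn else "gave up")
```

Returning None instead of raising lets the caller decide whether to keep running without the connection and try again later.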
I have not implemented the application-level reliability layer in my application yet because I am not too concerned if any of the data isn't eventually received. This is a decision that must be made on a case-by-case basis, however. If, for example, I run a build that takes two hours, but the "build success" message is dropped due to a disconnect, I might end up wasting another two hours re-running the build unnecessarily. If I had application-level reliability, I would know that the build succeeded. The trade-off to implementing this is added development time and system complexity, but it may be worth it.
recv() and SIGPIPE
I found it very confusing that I had to attempt to recv() from a
socket and fail in order to even tell that the connection was no longer
active. I expected that I would call e.g. isconnected() on the sockets
after select() tells me something happened to them. It does make sense
to me now that it's better to have recv() fail and tell me about the
disconnect. Otherwise, I might mistakenly assume that if I call
isconnected() I am then guaranteed to have a good recv(). By keeping
the disconnect tied to recv() failing, I know I need to handle
potential disconnects at any recv() invocation. The same goes for
send().
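In practice, "recv() failing" covers two cases: an orderly close shows up as a zero-length read, and an abrupt failure shows up as an error. A minimal Python sketch, assuming a blocking socket (or one that select() just reported readable) and a hypothetical helper name recv_or_disconnect:

```python
import socket


def recv_or_disconnect(conn: socket.socket, max_bytes: int = 4096):
    """Return the received bytes, or None if the connection is gone."""
    try:
        data = conn.recv(max_bytes)
    except ConnectionError:
        # Abrupt failure, e.g. the peer reset the connection.
        return None
    if data == b"":
        # Orderly shutdown: the peer closed its end of the stream.
        return None
    return data


if __name__ == "__main__":
    left, right = socket.socketpair()
    left.sendall(b"ping")
    print(recv_or_disconnect(right))
    left.close()
    print(recv_or_disconnect(right))
    right.close()
```

Funnelling every read through one helper like this keeps the disconnect check from being forgotten at any individual call site.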
On Linux, I also needed to disable signaling on send() so that I
could handle the connection error inline rather than need to register a
signal handler. By default, writing to a socket whose peer has gone
away raises SIGPIPE, which terminates the process unless it is handled
or ignored. I opted to add the MSG_NOSIGNAL flag to both send() and
recv() (it only has an effect on send(), which then fails with EPIPE
instead) and handle potential disconnect errors at each call. This might
not be as idiomatic on Linux, where a signal handler might be more
common, but it gives me a bit more control as an application developer.
It also works better when I port to Windows, which doesn't use signals
to report disconnects.
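Here is a small Python sketch of handling the send-side error inline. The helper name send_or_disconnect is hypothetical. One caveat: CPython already ignores SIGPIPE at startup, so the flag is redundant in Python and is passed here only to mirror what a C program would do.

```python
import socket

# MSG_NOSIGNAL is not available on every platform; fall back to no flag.
NOSIGNAL = getattr(socket, "MSG_NOSIGNAL", 0)


def send_or_disconnect(conn: socket.socket, payload: bytes) -> bool:
    """Return True if the kernel accepted the payload, False if disconnected."""
    try:
        conn.sendall(payload, NOSIGNAL)
        return True
    except ConnectionError:
        # Includes BrokenPipeError (EPIPE) and ConnectionResetError.
        return False


if __name__ == "__main__":
    left, right = socket.socketpair()
    print(send_or_disconnect(left, b"hello"))
    right.close()
    print(send_or_disconnect(left, b"hello"))
    left.close()
```

Over a real TCP connection, the first send() after the peer vanishes can still succeed, because the local kernel may not have learned about the disconnect yet. A True result here still only means the data reached the local send buffer.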
Don't use Linux "everything is a file" APIs with sockets
On Linux, sockets are file descriptors, so you can use the generic file APIs such as read(), write(), and close() on them. This is neat because you can then make your application support streaming to/from a file or a socket with the same code.
However, Windows does not treat sockets the same as files. If you want
to use native Windows APIs, you must use the functions dedicated to
them: send(), recv(), closesocket(), etc.
I would argue that, from a robustness standpoint, the Linux abstraction should not be used. How you handle a file no longer existing and how you handle a socket disconnecting are unlikely to be very similar. I'm sure I'll get counterarguments to this, saying that you should write your applications to treat these the same. I care about strong Windows support, so even if I'm wrong, my hands are tied anyway.
You could of course write your own abstraction layer for these, but again, the performance and reliability factors of files vs. sockets are quite different. It seems if you can treat them differently, you should, if only for the awareness and control. I will also ask: how often are you writing applications that want to accept either files or sockets? In my experience that sort of thing is a definite minority of cases. I usually know where my data is going, and usually want to know so that I can make more educated decisions about performance.
The application's main select() loop
The application knows when it needs to write to a socket. It does not
necessarily know when it needs to read from a socket. This means that I
should only add sockets to the write list of select() when I have a
message ready to send. I should always add all sockets to the read list
of select() if I want the application to be flexible to receiving
messages at any time.
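One round of such a loop could be sketched in Python as follows. The function name pump_once and the outgoing dictionary (each connected socket mapped to a bytearray of unsent bytes) are assumptions for illustration, and the sketch assumes at least one connection exists.

```python
import select
import socket


def pump_once(outgoing, timeout=1.0):
    """Run one select() round.

    Returns {socket: received bytes}; b"" means the peer disconnected.
    """
    # Every connection goes on the read list so messages can arrive any time.
    read_list = list(outgoing)
    # A connection goes on the write list only while it has queued data.
    write_list = [conn for conn, buffer in outgoing.items() if buffer]
    readable, writable, _ = select.select(read_list, write_list, [], timeout)

    received = {}
    for conn in readable:
        try:
            received[conn] = conn.recv(4096)
        except ConnectionError:
            received[conn] = b""
    for conn in writable:
        try:
            sent = conn.send(outgoing[conn])
            del outgoing[conn][:sent]  # keep whatever was not sent yet
        except ConnectionError:
            received[conn] = b""
    return received


if __name__ == "__main__":
    left, right = socket.socketpair()
    queues = {left: bytearray(b"hello"), right: bytearray()}
    pump_once(queues)         # left is writable, so "hello" goes out
    print(pump_once(queues))  # right is now readable
    left.close()
    right.close()
```

If sockets with nothing to send were left on the write list, select() would return immediately on every call, since an idle socket is almost always writable, and the loop would spin.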
If there are several rounds of back-and-forth that need to happen for a
single operation, I could still code that in, but it becomes less
flexible. It is easier to try to keep it to a single send, then handle
the receive in the main select loop. This might require storing state
in your metadata associated with each connection, or adding IDs to
messages to associate them with other state.
By keeping rounds of select() to only sends or only receives on each
socket, you handle multiple connections better. For example, you can
send an order to start a long operation on another machine, then receive
messages from other connections while the long operation is running.
Otherwise, you would have to put the long operation's send and receive
code on another thread or something to allow for other connections to be
handled.
It is less of a concern if you e.g. receive a request, then can quickly
put together and send a response. In those cases, you might as well just
receive and send in the same iteration of select() on that connection
to keep things simple. If the receiving application is coded with a
similar setup, they also can decide whether to receive right after they
send or go back into their select() loop.
Sockets are still cool
It took a while for me to understand what I needed to write applications to use sockets effectively. Now that I have paid that price, it feels like I've gained a new super power.
I felt something similar when I learned how to run sub-processes, and when I learned how to load code dynamically[^1]. These things break down barriers and open doors to new and exciting functionality.
While I have spent much longer than I expected building the project which required me to learn sockets, I am glad I did.
[^1]: If you haven't learned these, you really should. Here are the
functions, to give you something to search: For running
sub-processes, on Windows: CreateProcess, on Linux: fork, exec.
For dynamic loading, on Windows: LoadLibrary, GetProcAddress,
on Linux: dlopen, dlsym.
If you want to load code without using dynamic linking, you'll want
to learn about virtual memory and mmap() (Linux) or
VirtualAlloc() (Windows).
By using both sub-process execution and dynamic loading, you can have applications e.g. invoke a compiler to build a dynamic library, then immediately load that library into the same application. This is one way you could allow your users to modify and extend your application while it stays running.