Archived Entries

kbhit() equivalent in Windows MFC (VS 6.0)

Originally Posted: Wednesday, January 23, 2008

I’m just blogging this because it took me too long to find it… I needed to be able to break out of a loop by pressing a key; the example below uses the escape key. This block sits inside a larger loop and samples the keyboard every 100 iterations as bCount is incremented:

if (!(bCount % 100)) {
    // GetAsyncKeyState() returns nonzero if ESC is down right now,
    // or has been pressed since the last call; either way, bail out.
    SHORT keyData = GetAsyncKeyState(VK_ESCAPE); // ESC key
    if (keyData) {
        break;
    }
}

See the following references:

http://msdn2.microsoft.com/en-us/library/ms646293.aspx
http://msdn2.microsoft.com/en-us/library/ms645540(VS.85).aspx

UDP Checksums in Embedded, Linux, and Windows Systems

Originally Posted: Tuesday, January 8, 2008

Background:

I’m currently developing embedded and application software in order to stream large quantities of data between an embedded system (Avnet Fx-12 Mini Module) and a Host PC System. The embedded software is based on a Xilinx 9.1 Gigabit Ethernet reference design. This reference design sends and receives raw Ethernet frames by various methods – polled mode, interrupt driven, and scatter/gather DMA; but to be usable with PCs at the application layer, it must implement a standard protocol. Due to speed and resource requirements, I chose UDP.

The User Datagram Protocol (UDP) is an “open loop”, session-less means for networked device communication. UDP is open loop in the sense that it does not guarantee that the data being sent will ever arrive at its destination; using UDP is like sending junk mail via the postal system. One “assumes” that the letter will arrive at its destination; if not, the sender will not be notified, and the information will be discarded.

UDP is session-less in the sense that there is no interaction between the sender and receiver to ensure the integrity of the communication link, as there is with TCP. TCP, by contrast, is like a phone call between two people: the caller dials the recipient’s number; a conversation only ensues if the recipient answers; and both parties (typically) decide when to terminate it.

The relative simplicity of the UDP protocol would make it an ideal choice for high-speed data transfer in low-end embedded systems – if it weren’t for the UDP checksum calculation requirement.

The Ethernet, IP, and UDP Layers

There are too many details to describe here; but in summary:

  • Each UDP packet is encapsulated first by Internet Protocol (IP) information, and then by Ethernet information.
  • At each of these layers, there is a means to determine the integrity of the data being transmitted:
      – At the Ethernet Frame (lowest) layer, the hardware ensures data integrity by validating (or generating) a CRC on the entire Ethernet frame.
      – At the IP layer, the sender must calculate the checksum of the IP header portion of the data being transmitted. Because the data in this layer is small, the processing overhead is not excessive.
      – At the UDP layer, things get interesting.

UDP Packet Processing

In theory, every UDP packet sent by an embedded system contains a checksum computed over the UDP header, the UDP payload, and a pseudo-header drawn from the IP layer. This checksum is then used by the recipient to validate the integrity of the UDP data in the packet.

In practice, the UDP checksum is calculated in one of two ways. In higher-end systems, dedicated hardware and drivers offload the checksum calculation and insert the resulting value into the UDP packet. But in lower-end embedded systems, the checksum must be calculated in software. This adds a huge processing burden to the system; it can turn a data “stream” from an embedded system into a trickle.
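For context, here is a minimal sketch of the one’s-complement Internet checksum (per RFC 1071) that a software-only implementation has to run over every byte of every packet; the function name is mine:

/* RFC 1071 one's-complement checksum.  For UDP, this is computed
 * over the IP pseudo-header, the UDP header, and the entire payload.
 * A computed result of zero is transmitted as 0xFFFF, because a zero
 * checksum field means "no checksum was generated" (RFC 768). */
unsigned short inet_checksum(const unsigned char *buf, unsigned int len)
{
    unsigned long sum = 0;

    while (len > 1) {                 /* sum 16-bit big-endian words */
        sum += ((unsigned long)buf[0] << 8) | buf[1];
        buf += 2;
        len -= 2;
    }
    if (len)                          /* pad a trailing odd byte */
        sum += (unsigned long)buf[0] << 8;

    while (sum >> 16)                 /* fold carries back into 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (unsigned short)~sum;
}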

Eliminating UDP Checksum Calculations

What happens if we don’t perform the checksum calculation? RFC 1122 states the following:

  4.1.3.4  UDP Checksums

       A host MUST implement the facility to generate and validate
       UDP checksums.  An application MAY optionally be able to
       control whether a UDP checksum will be generated, but it
       MUST default to checksumming on.

       If a UDP datagram is received with a checksum that is non-
       zero and invalid, UDP MUST silently discard the datagram.
       An application MAY optionally be able to control whether UDP
       datagrams without checksums should be discarded or passed to
       the application.

So, as long as I set the UDP checksum field to zero, everything should be fine, right? Here’s what I discovered in practice:

  1. I typically use Ethereal (now Wireshark) to debug network traffic; with UDP checksums disabled, the sniffer showed all traffic being transmitted and received as expected. This applied in both Linux (Fedora 8, via Wireshark) and Windows XP.
  2. In Windows XP, however, the checksum-less UDP packets never reached the application. Surprise. Packet sniffers capture below the OS protocol stack, so they see traffic that the stack later drops; Windows elects to “silently discard” UDP packets with the checksum field set to zero. Try finding this documented somewhere.
  3. In Linux (Fedora 8, at least), the checksum-less UDP packets are accessible at the application layer.
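As an aside, my embedded sender just writes zero into the checksum field of the raw frame; on a Linux host, the equivalent effect can be had with the Linux-specific SO_NO_CHECK socket option. A minimal sketch, assuming a typical glibc environment:

#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>

#ifndef SO_NO_CHECK
#define SO_NO_CHECK 11  /* Linux-specific option, from <asm/socket.h> */
#endif

int main(void)
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    int one = 1;

    if (sock < 0) {
        perror("socket");
        return 1;
    }

    /* Ask the kernel to skip UDP checksum generation on this socket;
     * outgoing datagrams carry zero in the checksum field. */
    if (setsockopt(sock, SOL_SOCKET, SO_NO_CHECK, &one, sizeof(one)) < 0)
        perror("setsockopt(SO_NO_CHECK)");

    /* ... sendto() as usual ... */
    return 0;
}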

Conclusion

Zero-checksum UDP packets are ideal for streaming large quantities of data from embedded systems to Linux Hosts. Windows XP cannot be used to receive this data stream. If there is a way to disable the “silently discard” behavior of the Win32 stack, I have yet to find it.

The Jumbo Ethernet Frames Standard

Originally Posted: Thursday, August 9, 2007

Recently, I’ve been using Xilinx reference design Xapp902 as the basis for a (relatively) high-speed Ethernet data transfer project. Xapp902 targets Xilinx’s ML403 Evaluation Platform, which features the Virtex-4 FX12 device with an embedded PowerPC 405 core. (jargon off)

In an attempt to conserve CPU cycles and increase data throughput, I began looking for information on Jumbo Frame implementations.

Brief Background

Data in a Local Area Network is organized hierarchically. An Ethernet or IEEE 802.3 Frame is the lowest level of data organization. And while the two are compatible, an Ethernet Frame is not the same thing as an 802.3 Frame.

Eventually, the maximum Ethernet Frame size standardized to 1518 bytes:

– 6 bytes for the destination MAC address
– 6 bytes for the source MAC address
– 2 bytes for length or type information (this is the key difference between Ethernet & 802.3)
– 1500 bytes (maximum) for data
– 4 bytes for a CRC checksum on the Ethernet Frame
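As a sketch, the 14 bytes of addressing and type information that precede the payload can be modeled as a packed C struct (the field and type names are mine; the 4-byte CRC trails the payload and is normally handled by the MAC hardware):

#pragma pack(push, 1)
typedef struct {
    unsigned char  dest_mac[6];   /* destination MAC address */
    unsigned char  src_mac[6];    /* source MAC address */
    unsigned short len_or_type;   /* length (802.3) or EtherType (Ethernet II);
                                     big-endian on the wire */
} eth_header_t;                   /* 14 bytes; up to 1500 bytes of data follow */
#pragma pack(pop)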

At the hardware level, (the MAC and PHY layers), dedicated hardware components are responsible for transmission, reception, and error detection in these frames; however, the System CPU is generally responsible for orchestrating data handling once an entire packet becomes available.

This scheme was fine in the days of 10 Mbit/sec Ethernet, but as network speeds increase, the CPU has less and less time for orchestration, and spends an ever-larger percentage of its time just handling network traffic. At Gigabit Ethernet speeds, packets arrive 100 times faster than they did with the old 10 Mbit/sec protocol… Enter the Jumbo Frame.

The Jumbo Frame

This problem was recognized years ago. In response, engineers proposed tweaking the meaning of the length/type field in the Ethernet/802.3 frame, letting it describe Frames carrying more than 1500 bytes of data.

By increasing the Frame size (to approx. 9 KBytes), the network processing burden on the CPU could be reduced proportionately: roughly six times fewer frames, and six times fewer per-frame interruptions, for the same amount of data. Makes sense, doesn’t it? The problem was that too much networking infrastructure was already in place by then, and Jumbo Frames were largely incompatible with it. Consequently, the Jumbo Frame proposal was rejected.

But, that didn’t stop Jumbo Frames from being implemented in systems; they’ve just been declared as proprietary. A proprietary non-standard standard. Oh boy.

I’m not sure how many non-standard implementations there are out there; in the Xilinx Ethernet device driver code I’ve seen, there are some values that can be changed in the MAC hardware to allow receipt & transmission of Jumbo Frames. But it looks as if the length/type field must be used to indicate Frame size.

Of course, the resulting packets would be non-standard, which means that they could be rejected by other devices on the network. Not good if I need to move data through switches and routers.

Given the hassles described above, I’ve decided to stick with regular 1518 byte Frames; but, I would like to know if and how other devices and systems are handling Jumbo Frames. Do they just “flip the Jumbo hardware switch” and assume that the length/type field represents the Frame size?
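For comparison, on a Linux host the jumbo “switch” is essentially just the interface MTU. Here is a minimal sketch using the standard SIOCSIFMTU ioctl (the function name is mine; it assumes the NIC and driver accept jumbo sizes, and it requires root privileges):

#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <unistd.h>

/* Raise an interface's MTU so the driver will pass jumbo frames.
 * Every device in the path (NICs, switches, routers) must also
 * support the larger size, or the frames will be dropped. */
int set_jumbo_mtu(const char *ifname, int mtu)
{
    struct ifreq ifr;
    int fd, ret;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
    ifr.ifr_mtu = mtu;

    ret = ioctl(fd, SIOCSIFMTU, &ifr);
    close(fd);
    return ret;
}

/* e.g. set_jumbo_mtu("eth0", 9000); */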

My understanding is that Jumbo Frames are part of the Gigabit Ethernet standard, but I haven’t seen anything documenting this.

NETWORKED SENSOR SYSTEM (NSS)

Originally Posted: December 2006

We designed the NSS for security applications. It consists of three modules (described immediately below). They intercommunicate via the internet.

GIS Clients (PCs) subscribe to data from one or more deployed Networked Sensors via the Sensor Bridge.

The Sensor Bridge aggregates data from the Network Sensors. It can be hosted on a GIS Client PC, or remotely on any internet-connected PC.

The Network Sensors transmit data when an Event occurs.

The prototype Network Sensor (NS) has a laser and video camera attached to a 5′ umbilical. When the laser beam is interrupted, the NS notifies the Sensor Bridge that an Event has occurred, captures a frame of video (a “still image”), and retains its logged data. These data are uploaded to the Sensor Bridge.

The Event data appears on the GIS Client Display. The User can display a satellite image representing the Event location, along with data transmitted by the Networked Sensor. As of 2006, a video frame (rendered as a JPEG) and accelerometer signature data are provided.

I finally got a chance to package the prototype. I had a breadboard system cobbled together on the bench while I worked on the embedded software. It’s been fun to shift gears, and get my hands dirty…

The main components of the Networked Sensor are:

– Microcontroller SBC: (Rabbit BL2000)
– Frame grabber card: (uCFG) with eval MBoard
– Laser break-beam sensor
– Video Camera

I’m working on the next-generation NS now; the main goal is hardware cost reduction, as these prototype parts are expensive. I’m replacing the BL2000 with an RCM series uController, and designing an interface board. These changes should bring down the cost by over 50%.
