Texas Instruments Incorporated and its subsidiaries (TI) reserve the right to make corrections,
modifications, enhancements, improvements, and other changes to its products and services at
any time and to discontinue any product or service without notice. Customers should obtain the
latest relevant information before placing orders and should verify that such information is current
and complete. All products are sold subject to TI’s terms and conditions of sale supplied at the
time of order acknowledgment.
TI warrants performance of its hardware products to the specifications applicable at the time of
sale in accordance with TI’s standard warranty. Testing and other quality control techniques are
used to the extent TI deems necessary to support this warranty. Except where mandated by
government requirements, testing of all parameters of each product is not necessarily performed.
TI assumes no liability for applications assistance or customer product design. Customers are
responsible for their products and applications using TI components. To minimize the risks
associated with customer products and applications, customers should provide adequate design
and operating safeguards.
TI does not warrant or represent that any license, either express or implied, is granted under any
TI patent right, copyright, mask work right, or other TI intellectual property right relating to any
combination, machine, or process in which TI products or services are used. Information
published by TI regarding third party products or services does not constitute a license from TI
to use such products or services or a warranty or endorsement thereof. Use of such information
may require a license from a third party under the patents or other intellectual property of that third
party, or a license from TI under the patents or other intellectual property of TI.
Resale of TI products or services with statements different from or beyond the parameters stated
by TI for that product or service voids all express and any implied warranties for the associated
TI product or service and is an unfair and deceptive business practice. TI is not responsible or
liable for any such statements.
Mailing Address:
Texas Instruments
Post Office Box 655303
Dallas, Texas 75265
- Chapter 3 – TCP/IP Stack Code and Data Size, lists the code size of various TCP/IP stack components compiled for the C6211 with compiler optimization level 2.
Tables
2–1 Standard Sockets API vs. “No-Copy” API Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3
2–2 Performance and CPU Loading with Different Driver Modes Tests . . . . . . . . . . . . . . . . . . . 2-4
2–3 Normal Network Load Conditions Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-4
3–1 Base Code Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2
3–2 Optional Component Code Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-3
3–3 Data Memory Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-4
Chapter 1
This chapter describes the design, programming, and service features of the
TCP/IP stack.
Design Features
Programming Features
TCP, IP, and socket properties are fully configurable with the setsockopt()
function. There are also enhancements to the socket layer options to
support protocols that have historically required kernel modifications,
such as DHCP, private broadcast-based servers, and traceroute.
- Telnet server
Chapter 2
2.1 Peak Performance Numbers
Running on the TMS320C6211 DSK and the LogicIO/Macronix MAC with the
standard sockets API and classical sockets programming, the stack gets
about 27 Mb/s on a TCP stream for receive operations and 29 Mb/s on a TCP
stream for transmit.
When using the “no-copy” API, the stack gets 34 Mb/s on a TCP stream for
receive with the same hardware. UDP performance should be somewhat faster,
since UDP is simpler and does not copy packet data for either TX or RX
operations.
Note:
The performance of the TCP/IP stack when using the LogicIO ETHC6000 is
somewhat bound by the long bus cycle access time of the Ethernet
hardware.
2.2 MIPS Consumption
The tests described in this section were run using the NDK on a 150-MHz
TMS320C6711 DSP.
TCP Socket Type   API Calls Used     Sustained Data Rate   CPU Loading
SOCK_STREAM       select(), recv()   26.6 Mb/s             70.0%
2.2.2 Part 2: Performance and CPU Loading with Different Driver Modes
The following tests used the same benchmark with the recvnc() function described
above, along with a similar data benchmark to test data transmission. Here the
performance and CPU loading of all three sample drivers can be compared
and contrasted. On the EDMA version of the driver, the CPU is freed to perform
other tasks during EDMA operations.
Table 2–2. Performance and CPU Loading with Different Driver Modes Tests

Driver Mode Used   TCP Operation   Sustained Data Rate   CPU Loading
Polling Mode       Receive         26.0 Mb/s             78.3%
EDMA Mode          Receive         7.57 Mb/s             11.2%
TCP/IP “No-Copy” Socket Options
By default, neither UDP nor RAW sockets use send or receive buffers. However,
on receive, the sockets API functions recv() and recvfrom() require a data
buffer copy because of how the calling parameters to the functions are defined.
In the stack library, two alternative functions (recvnc() and recvncfrom()) are
provided to allow an application to obtain received data buffers directly from
the stack without a copy operation. When the application is finished with the
data, it returns the buffers to the system by calling recvncfree().
By default, TCP uses both a send and receive buffer. The send buffer is used
since the TCP protocol can require “reshaping” or retransmission of data due
to window sizes, lost packets, etc. On receive, the standard TCP socket also
has a receive buffer. The receive buffer is used to coalesce data received in
packet buffers. Coalescing data is important for protocols that transmit data in
very small bursts (like a telnet session).
For TCP applications that get data in large bursts (and tend not to use flags
like MSG_WAITALL on receive), the TCP receive buffer can be eliminated by
specifying an alternate TCP socket stream type of SOCK_STREAMNC. Without
the receive buffer, TCP queues up the actual network packet buffers containing
receive data instead of coalescing the data into a receive buffer. This
eliminates a data copy. TCP sockets that use the SOCK_STREAMNC stream
type are 100% compatible with the standard TCP socket type.
Note that care must be taken when eliminating the TCP receive buffer,
since large numbers of packet buffers can be tied up holding a very small
amount of received data. Also, since packet buffers come directly from the
Ethernet driver in the HAL, there may be a limited supply available. If the
MSG_WAITALL flag is used on a recv() or recvfrom() call, it is possible for
all packet buffers to be consumed before the specified amount of payload
data is received. This would cause a deadlock if no socket timeout is
specified.
Sockets that use SOCK_STREAMNC have an added benefit: they can
also be used with the recvnc() and recvncfrom() functions that UDP and RAW
sockets use to eliminate the final data copy from the stack to the sockets
application. Using these “no-copy” functions with SOCK_STREAMNC
eliminates both data copies used by the standard TCP socket. Note that when
recvnc() and recvncfrom() are used with TCP, out-of-band data is not
supported. If the SO_OOBINLINE socket option is set, the out-of-band data is
retained, but the out-of-band data mark is discarded. If the inline socket
option is not used, the out-of-band data is discarded.
Chapter 3
This chapter lists the code size of various TCP/IP stack components compiled
for the C6211 with compiler optimization level 2.
Base Code Size
Optional Component Code Size
Chapter 4
This chapter contains information about the ETHC6000 Ethernet hardware and
driver.
4.1 Ethernet Hardware Information
The Ethernet DSK daughter card used in these tests was designed by LogicIO
using a Macronix MX98728EC MAC chip (GMAC). The design of the Ethernet
driver is partially determined by the operation of this hardware.
The EMIF timings required to access the chip are quite sensitive, and
accesses that are too aggressive return invalid data. The EMIF setting for
this board is 0x34a31026, which translates to 23 cycles per write and 25
cycles per read on a 100-MHz external interface. The board thus has a
theoretical performance limit of 128 Mb/s on receive, and about 58 Mb/s on
transmit (the board cannot send while the TX FIFO is being filled, so the
139 Mb/s FIFO fill rate and the 100 Mb/s line rate combine in series).
The total theoretical full-duplex throughput of the card is bound by the
transmit rate. However, since the memory interface used by the Ethernet card
is also used for other CPU operations (such as cache line loads), the practical
performance limit is about half that. The board has been measured at about
39 Mb/s full duplex on internal loopback tests.
4.1.2 Resources
Depending on whether the CPU or the EDMA is used for fetching receive
packet data, the INT5 signal may or may not be used. It is not necessary when
the CPU is used to read the data, since the status of this signal can also be
determined from a GMAC register bit.
Also, for drivers designed to poll for incoming packets, the INT4 signal is not
necessary since the CPU is able to compare a head and tail pointer on the
GMAC device to determine if packets are available.
Although the GMAC provides an 8-bit register interface, the chip can only be
accessed as a 32-bit device by the DSP. This requires the driver to view the
individual registers as groups of “super registers,” each containing four of
the intended register values. This can be clumsy, and in certain cases may
limit the functionality of the device. Fortunately, the problem does not
prevent the device from operating in a fashion compatible with the DSP.
Also, the GMAC device signals interrupts to the CPU in a level-sensitive
fashion, while the DSP supports only edge-triggered interrupts. If the GMAC
interrupt signal were to stay low across back-to-back interrupts, the DSP
could fall out of sync and become unable to detect additional interrupts. The
software device driver must therefore incorporate a periodic polling routine
that handles lost-interrupt conditions.
The simplest form of the Ethernet driver is the polling driver. This driver uses
a special “polling mode” of the stack library that invokes a single low priority
task to service the Ethernet device along with any other required background
tasks. This environment has the benefit that no interrupts are required to run
the network and hence real-time tasks can be deterministically scheduled. The
drawback to this approach is that the system programmer must integrate stack
polling code with any other background tasks in a single low priority “idle” task.
As an alternative to polling, the driver can also run in an interrupt-driven
“semaphore” mode. Here, the Ethernet device’s polling function is called twice
a second instead of constantly, and it is the driver’s responsibility to use
interrupts to determine when network packets are available. This allows the
CPU to schedule time to the stack only when packet activity occurs. The driver
indicates packet events by signaling a global semaphore, which causes the
half-second polling routine to be called early.
The advantage of this method is the stack can run independently of any low
priority “idle” task. The disadvantage is that the CPU must now take on the
overhead of one interrupt per transmit, and possibly one interrupt per receive.
Still, using the interrupt mode increases overall throughput.
On both transmit and receive operations, the EDMA controller can be used
instead of the CPU to copy data to and from the GMAC device. In general, using
the EDMA is faster than using the CPU. This is especially true on the GMAC
device for receive, where each 16-byte burst requires a polling loop to verify
a “data ready” condition. The data ready pin is also hooked to INT5 on the
DSP, and the DSP’s frame-synced EDMA can handle multiple bursts of 16 bytes
well, without the need to interrupt the CPU.
Ethernet Driver Information
The main advantages of using the EDMA controller are performance and lower
CPU overhead. The disadvantage is that the driver ties up more system
resources. In addition to the normal device interrupt (INT4), this driver uses
two EDMA channels (channels 4 and 5), an EDMA synchronization interrupt from
the GMAC (INT5), and a CPU interrupt for the DMA completion signal (INT8).
- EDMA driver that uses INT4, INT5, INT8, and two EDMA channels