Presented in 2017 International Conference on Recent Advances in Electronics and Communication Technology
978-1-5090-6701-5/17 $31.00 © 2017 IEEE DOI 10.1109/ICRAECT.2017.11
Abstract—The Universal Serial Bus 3.0 (USB 3.0) is one of the highly acclaimed protocols in the field of communication. Speed is an important performance parameter, especially in the SuperSpeed mode of USB 3.0. Among transfers such as Interrupt, Control, Bulk Streaming and Isochronous, the performance of Bulk transfer is tied to data rate because large amounts of data move between the communicating ports. With the advent of new technologies, there is always a requirement to achieve a value close to the design speed of 5 gigabits per second (Gbps) theoretically, or around 400 megabytes per second (MBps) practically. Two generic algorithms are proposed for speed analysis, one each for the Bulk IN and Bulk OUT transactions. A mathematical relationship for speed computation is also introduced. The methodology is simplified by converting the algorithms to flowcharts. They help in estimating speed values, prove effective in quantifying the affecting factors, and provide a quicker and easier way to arrive at results for comparison. This paper details the algorithms, flowcharts and related formulas developed to calculate speed for USB 3.0 SuperSpeed Bulk transactions.
Index Terms—Bulk IN, Bulk OUT, Transaction time, Data length, Number of packets, Types of packets, Speed based Performance, Algorithm, Mathematical Representation, Flowcharts
The USB 3.0 protocol specification mentions a theoretical data rate of 5 Gbps. However, considering the design constraints, the practical value is expected to be around 400 MBps. Hence any USB 3.0 SuperSpeed product should aim to at least approach this practical value. To achieve this, estimating and computing the speed-based performance of a USB 3.0 SuperSpeed host or device becomes essential, along with a design approach directed at the same goal. The data rate calculation helps in verifying how efficient the design implementation is, so that it can conform to the specifications.(1)(2)(3) Considering the above points, two generic and simple algorithms are proposed, with prerequisite knowledge of the design and data flow of USB 3.0 SuperSpeed Bulk transactions in both the IN and OUT directions. This includes identification and estimation of associated packets such as Not Ready (NRDY), Endpoint Ready (ERDY), Acknowledgement (ACK), End Of Burst (EOB), Zero Length Data Packet (Empty DP), Terminating ACKs (ACKs with NumP=0), etc., which are responsible for flow control as well as data transfer mechanisms. The interdependence of causative factors is taken into account for all types of design scenarios, including both erroneous and non-erroneous bulk transfers. This paper first defines the theoretical parameters of concern for speed-based performance and then proposes individual algorithms for Bulk IN and OUT transactions. The algorithms apply to all cases in general and are represented as flowcharts for simplicity, supported by a mathematical model consisting of equations to arrive at the solution.
THE BULK IN AND OUT TRANSACTIONS
A Bulk transaction is the transfer of a large amount of data from one endpoint to another, usually with one being a host and the other a device. It guarantees data delivery but not latency or bandwidth.(4)(5)(6) The basic bulk transfer consists of Data Packets (DP) and the associated Transaction Packet (TP) ACKs sent in response to received DPs. The data transfer may be delayed by other TPs such as NRDY, ERDY, ACKs with NumP=0, etc., when the flow control mechanism is entered. The bulk transaction can be classified into two types:
- Bulk IN
This refers to data transfer from the device endpoint to the host, denoted as upstream. The data packets are represented as DP IN, as observed from the host side. They carry data packet payloads of variable length. The host initiates the data flow by sending an ACK request to the device before the reception of the first DP IN; the request asks the device to start sending data and carries fields such as the number of packets the host can accept. The device responds with DP INs and keeps sending them without waiting for ACKs. The device responds with NRDY when it is not ready for the transaction and later with ERDY to resume communication.(7)(8) However, the device may also send a DP IN with the EOB field set to 1, indicating termination of the respective burst, to which the host responds with an ACK. The introduction of Empty DPs, which have a zero-length payload, terminates the data stage. The above flow holds for non-erroneous transfer. If an error occurs, such as a Header Sequence Number mismatch or a CRC-16 error, the host responds with an ACK whose Retry (rty) bit is set to 1, and the device retransmits the DP. The entire transaction is considered terminated once the host has received the last DP IN and sent the subsequent ACK to the device. Figure 1 shows the pictorial representation of the Bulk IN flow mechanism. The following parameters are therefore taken into account for the speed-based performance of the Bulk IN transaction –
- The number of NRDY and ERDY packets occurring during the transaction
- The number of Empty DPs (if generated) during the transaction
- The number of DP INs with EOB field set to 1 during the transaction
The numbers of NRDYs and ERDYs have a one-to-one relationship. When the device sends NRDY, it takes some time to send ERDY, after which the data stage resumes. This delay shall be taken into account as a cause of deteriorating data rate. Empty DPs may be generated in the design to reduce idle time.(9) However, since every Empty DP is also answered with an ACK, an increase in both parameters reduces speed. The DP IN with EOB=1 is used to terminate the transfer for the respective burst. The host answers it with an ACK with NumP=0, and an ERDY follows; again the delay is of importance and affects the speed-based performance of the Bulk IN transaction. The number of retransmitted DP INs is proportional to the number of retries (ACK with rty=1) under erroneous conditions and hence contributes to reducing speed.
- Bulk OUT
In contrast to the previous type, this is the data transfer from the host to the device endpoint denoted as downstream.
The data packets are represented as DP OUT, again taking the host side as the reference. They are also associated with data packet payloads and with transaction packets similar to the previous case, such as NRDY, ERDY and ACKs. The flow is again initiated by the host, which starts sending DP OUTs to the device, and the device responds with ACKs. The host need not wait for the previous ACK before sending the next data packet.(10)(11) The device may again respond with NRDY and a subsequent ERDY to communicate that it is not ready and that the endpoint is ready, respectively. However, as it enters the flow control condition, the device may generate an ACK with NumP=0 to indicate its inability to accept DP OUTs from the host.(12)(13) This terminates the data stage and shall be followed by an ERDY. Figure 2 shows the pictorial representation of the Bulk OUT flow mechanism.
Thus, following are considered as the parameters that affect the speed based performance of the Bulk OUT transaction –
- The number of NRDY and ERDY packets occurring during the transaction
- The number of ACKs with NumP=0 generated by device during the transaction.
The device responds with an ACK with NumP=0 if it can no longer accept data, marking the end of the transaction. This is followed by an ERDY to resume the transfer. The interdependence of the terminating ACK and ERDY is observed in the Bulk OUT transaction.(14) Based on the above theoretical concepts, the next section proposes the algorithms and the constraints taken into consideration for the estimation and calculation of speed.
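The parameter lists above can be tallied mechanically from a captured packet trace. The following is a minimal sketch, assuming a hypothetical trace format in which each packet is a dictionary with the keys `kind` (`"DP"`, `"ACK"`, `"NRDY"` or `"ERDY"`), `length`, `eob`, `nump` and `rty`. These key names and the function `tally_factors` are illustrative only; they are not taken from the USB 3.0 specification or from the authors' implementation.

```python
from collections import Counter


def tally_factors(trace, direction):
    """Count the packets that the text lists as speed-affecting.

    trace     -- list of dicts in a hypothetical trace format (assumption)
    direction -- "IN" or "OUT"
    """
    counts = Counter()
    for pkt in trace:
        kind = pkt["kind"]
        if kind in ("NRDY", "ERDY"):
            # Flow-control packets are counted for both directions.
            counts[kind] += 1
        elif kind == "ACK" and pkt.get("rty", 0) == 1:
            # Retry requests indicate an erroneous data packet.
            counts["ACK_RETRY"] += 1
        elif kind == "ACK" and pkt.get("nump", 1) == 0 and direction == "OUT":
            # Terminating ACKs are listed as a Bulk OUT parameter.
            counts["ACK_NUMP0"] += 1
        elif kind == "DP" and direction == "IN":
            if pkt.get("length", 0) == 0:
                counts["EMPTY_DP"] += 1      # zero-length payload
            elif pkt.get("eob", 0) == 1:
                counts["DP_EOB"] += 1        # end of burst
    return dict(counts)
```

The one-to-one relationship between NRDY and ERDY described in the text can then be checked by comparing the two counts for a given trace.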
ALGORITHMS FOR SPEED ANALYSIS
The algorithms for the Bulk IN and OUT transactions are discussed independently because the two transfers proceed differently. Algorithm development has the following parameters as prerequisites –
- Data length – It is defined as the length or size of the payload of each data packet. The data length ranges from 1 to 1024 Bytes, with fixed, random or variable assignments. A transfer may be assigned a data length greater than 1024 Bytes, but since 1024 Bytes is the maximum for each packet, the remaining Bytes are sent in the next packet. The data length may be the same for all packets or may vary from packet to packet. For short packets, the data length is less than the maximum packet size. The data length field is set for every DP accordingly.
- Number Of Packets – It is defined as the total number of independent data packets that are sent or received during the transaction. Only non-zero-length (non-empty) packets are counted, and only if they are successfully transferred across the channel. The size may be the same for all packets or may vary individually. The data length of each packet is bounded by the burst size and bytes per interval fields. The number of packets is a consequence of data length, because any bytes beyond the maximum size specified for a burst and the bytes per interval are sent in the next packet, thus increasing the packet count.
- Total Transaction Time – This is defined as the total time during which the transfer of the required data, or transaction, takes place. It is bounded by the start and stop time instants, and the difference of the two gives its value. The total transaction time can also be defined as the average time interval during which the entire transfer of successful data takes place between communicating ports. The points of reference for start and stop depend on the direction of the Bulk transaction.(15)(16)(17)
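The relation between data length and number of packets can be illustrated with a short sketch. It assumes only the 1024-Byte per-packet maximum stated above; the function name `split_into_packets` is hypothetical, and burst size and bytes-per-interval limits are deliberately not modelled.

```python
MAX_PACKET_SIZE = 1024  # Bytes; per-packet maximum as stated in the text


def split_into_packets(total_bytes, max_packet_size=MAX_PACKET_SIZE):
    """Return the data length of each packet needed to carry total_bytes.

    Bytes beyond the per-packet maximum spill into the next packet, so the
    number of packets follows from the data length.
    """
    if total_bytes <= 0:
        return []
    full_packets, remainder = divmod(total_bytes, max_packet_size)
    lengths = [max_packet_size] * full_packets
    if remainder:
        lengths.append(remainder)  # trailing short packet
    return lengths


# 2500 Bytes need two full packets and one short packet of 452 Bytes.
print(split_into_packets(2500))  # [1024, 1024, 452]
```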
Basically, data rate or speed of any communication is defined as the rate of transfer of data. In general, it is the amount of data transferred in unit time. This approach is applicable to the Bulk transactions so that speed is obtained according to the measurement of choice. The algorithms are discussed as follows –
- Bulk IN – Algorithm
- The start point of the transaction is always attributed to the host initiation. Hence the start time of the transaction is taken at the time instant where the host sends the ACK request to the device before the reception of the first DP IN. This point marks the initialization of the transaction and is denoted as $t_{\text{start-Bulk-IN}}$. Once taken as the initial value, the start time shall not be updated and shall be retained until the completion of the transaction.
- The stop point of the transaction is always attributed to the termination of the transaction. Hence the stop time is taken at the time instant where the host receives the last DP IN and responds with an ACK. This time instant is denoted as $t_{\text{stop-Bulk-IN}}$. It is updated after every data stage and holds its final value at transfer completion.
- The difference between the stop time $t_{\text{stop-Bulk-IN}}$ and the start time $t_{\text{start-Bulk-IN}}$ gives the total transaction time, represented by $t_{\text{diff-Bulk-IN}}$. It shall include all types of packets, such as DP INs or TPs, that occur during the whole transaction. Even under erroneous transfer, the time taken by retries and retransmissions shall be included in this parameter. The mathematical representation is shown in Equation 1.
$$t_{\text{diff-Bulk-IN}} = t_{\text{stop-Bulk-IN}} - t_{\text{start-Bulk-IN}} \tag{1}$$
- The data length of each data packet is denoted by $dl$. This corresponds to the size of the data packet payload. The value is taken as constant if all packets have the same data length. Otherwise, individual $dl$ values are considered for each packet. This parameter shall be counted only for valid data and not for Empty DPs or erroneous DPs.
- The number of packets is the total number of successful DP INs received in the total transaction time $t_{\text{diff-Bulk-IN}}$. It is represented by $np$. For a uniform data length across all packets, $np$ is taken as a whole. Otherwise, it is taken as the count of packets for each corresponding $dl$. This parameter shall also be counted only for valid data, and erroneous or Empty DPs shall be discarded. A separate parameter may be introduced to estimate the number of Empty or erroneous DPs because of their causative effect in deteriorating speed.
- The speed or data rate of Bulk IN is the amount of DP IN data transferred from device to host in the average time interval. It is denoted by $\text{Speed}_{\text{Bulk-IN}}$. The mathematical representation for speed computation is shown in the following equations. If the transfer involves $np$ packets with variable $dl$ values, then Equation 2 holds.
$$\text{Speed}_{\text{Bulk-IN}} = \frac{\sum_{dl=1}^{1024} dl \cdot np_{dl}}{t_{\text{diff-Bulk-IN}}} \tag{2}$$
where $np_{dl}$ represents the number of packets with data length $dl$. The summation adds up the product of data length and packet count over every data length value that occurs. If the transfer involves packets of uniform data length, then Equation 3 holds.
$$\text{Speed}_{\text{Bulk-IN}} = \frac{dl \cdot np}{t_{\text{diff-Bulk-IN}}} \tag{3}$$
- The numbers of NRDYs, Empty DP INs and DP INs with the EOB field set to 1 are computed by identifying the fields as presented in the USB 3.0 specification. NRDY and ERDY are identified from the Subtype field in the TP, while Empty DP INs are identified from the data length field. Estimating these causative parameters helps to characterize the performance through comparative analysis.(18) For erroneous transfer, the numbers of ACKs with rty=1 and of erroneous DP INs are taken into account. However, the equations to compute the speed of the transaction remain the same for all cases. Hence they can be understood to be generic in nature, covering a wide range of design scenarios and constraints.
Figure 3 shows the flowchart depicting the methodology involving speed computation for Bulk IN transaction. It holds good for both error and error-less conditions and is hence generic. All the parameters affecting speed are included in the flowchart to cover minimum scenario constraints.
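The Bulk IN steps can be sketched in code as follows. This is an illustrative sketch, not the authors' implementation. It assumes a hypothetical time-ordered trace of dictionaries with the keys `t` (timestamp in any consistent unit), `kind`, `length` and an optional `error` flag on data packets that were rejected, and it assumes the trace ends with the host's final ACK. The function name `bulk_in_speed` is likewise an assumption.

```python
from collections import Counter


def bulk_in_speed(trace):
    """Compute Bulk IN speed from a time-ordered packet trace.

    Returns (speed, np_dl), where speed is payload Bytes per time unit and
    np_dl maps each data length dl to the number of valid packets.
    """
    t_start = None      # host ACK request; set once, never updated
    t_stop = None       # latest ACK seen after a DP IN; updated per data stage
    np_dl = Counter()
    seen_dp = False

    for pkt in trace:
        if pkt["kind"] == "ACK":
            if t_start is None:
                t_start = pkt["t"]
            elif seen_dp:
                t_stop = pkt["t"]
        elif pkt["kind"] == "DP" and t_start is not None:
            seen_dp = True
            # Only valid, non-empty payloads count towards dl and np.
            if pkt["length"] > 0 and not pkt.get("error", False):
                np_dl[pkt["length"]] += 1

    if t_start is None or t_stop is None or t_stop <= t_start:
        raise ValueError("trace does not contain a complete Bulk IN transaction")

    t_diff = t_stop - t_start                                   # stop minus start
    total_bytes = sum(dl * count for dl, count in np_dl.items())  # numerator of Eq. 2
    return total_bytes / t_diff, dict(np_dl)
```

NRDY, ERDY, retries and Empty DPs are not counted as data, but the time they occupy still falls between the start and stop instants, so they lower the computed speed as the text describes.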
- Bulk OUT Transaction – Algorithm
- The start point of the transaction is again attributed to the host initiation. Hence the start time is taken as the time instant at which the host sends the first DP OUT to the device. This is denoted by $t_{\text{start-Bulk-OUT}}$. It is the initial time instant and shall not be updated during the entire flow.
- The stop point is ascribed to the end of the transaction, where the device sends the ACK for the last DP OUT sent by the host. This time instant is denoted by $t_{\text{stop-Bulk-OUT}}$. It is updated after each data stage and holds its final value at the termination of the transfer.
- The difference between the stop time $t_{\text{stop-Bulk-OUT}}$ and the start time $t_{\text{start-Bulk-OUT}}$ gives the total transaction time, denoted by $t_{\text{diff-Bulk-OUT}}$. This time interval is the average time, which includes all packets such as DPs and TPs. These include erroneous DP OUTs and retried or retransmitted DP OUTs during the transfer. The time difference is represented mathematically in Equation 4.
$$t_{\text{diff-Bulk-OUT}} = t_{\text{stop-Bulk-OUT}} - t_{\text{start-Bulk-OUT}} \tag{4}$$
- The data length $dl$ of each packet and the number of packets $np$ are calculated as in the Bulk IN case. However, Empty DPs are not generated here. Erroneous DPs are considered in error-based transfer. The uniform and non-uniform $dl$ cases are treated as in Bulk IN. The speed or data rate is denoted by $\text{Speed}_{\text{Bulk-OUT}}$ and is given by the following equations. If the transfer involves $np$ packets with variable $dl$ values, then Equation 5 holds.
$$\text{Speed}_{\text{Bulk-OUT}} = \frac{\sum_{dl=1}^{1024} dl \cdot np_{dl}}{t_{\text{diff-Bulk-OUT}}} \tag{5}$$
where $np_{dl}$ represents the number of packets with data length $dl$. If the transfer involves packets of uniform data length, then Equation 6 holds.
$$\text{Speed}_{\text{Bulk-OUT}} = \frac{dl \cdot np}{t_{\text{diff-Bulk-OUT}}} \tag{6}$$
- In addition to NRDY, ERDY and with exclusion of DP with EOB=1 cases as in Bulk IN transfer, the calculation of ACKs with NumP=0 or terminating ACKs plays a major role in speed based performance analysis of Bulk OUT transaction. The same procedure is carried out for NRDY and ERDY. However, terminating ACKs are identified through NumP field reset to 0. In case of error conditions, ACK with rty=1 is also considered. Here too, the equations for Bulk OUT remain generic for all cases covering major constraints.
Figure 4 depicts the flowchart of speed computation for Bulk OUT transaction. It differs from the previous Bulk IN flow chart. It is applicable for error and error-less condition. All the causative factors of speed characterization are shown in the flowchart for basic scenario coverage.
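A corresponding sketch for Bulk OUT is given below. It differs from the Bulk IN case in its start point (the first DP OUT) and in tracking terminating ACKs, and it additionally reports the time spent waiting for ERDY as one way of quantifying the flow-control delay. The trace format (keys `t`, `kind`, `length`, `nump`, `error`), the function name `bulk_out_speed` and the `stalled_time` measure are assumptions made for illustration, not part of the paper's flowchart.

```python
def bulk_out_speed(trace):
    """Compute Bulk OUT speed plus flow-control statistics from a trace."""
    t_start = None          # first DP OUT from the host; set once
    t_stop = None           # latest device ACK; updated per data stage
    payload = 0             # Bytes of valid payload
    terminating_acks = 0    # ACKs with NumP=0
    stall_begin = None      # time of the pending NRDY or terminating ACK
    stalled_time = 0        # total time between a stall and the next ERDY

    for pkt in trace:
        kind = pkt["kind"]
        if kind == "DP":
            if t_start is None:
                t_start = pkt["t"]
            if not pkt.get("error", False):
                payload += pkt["length"]
        elif kind == "ACK" and t_start is not None:
            t_stop = pkt["t"]
            if pkt.get("nump", 1) == 0:
                terminating_acks += 1
                stall_begin = pkt["t"]
        elif kind == "NRDY":
            stall_begin = pkt["t"]
        elif kind == "ERDY" and stall_begin is not None:
            stalled_time += pkt["t"] - stall_begin
            stall_begin = None

    if t_start is None or t_stop is None or t_stop <= t_start:
        raise ValueError("trace does not contain a complete Bulk OUT transaction")

    return {
        "speed": payload / (t_stop - t_start),   # Bytes per time unit
        "terminating_acks": terminating_acks,
        "stalled_time": stalled_time,
    }
```

Comparing `stalled_time` with the total transaction time gives a rough indication of how much of the speed loss is attributable to flow control rather than to data transfer itself.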
The algorithms described above are subject to the following considerations –
- The time values may be expressed in any unit of measurement, such as nanoseconds or picoseconds, as appropriate for obtaining speed in the desired unit.
- The data length is usually measured in bits or Bytes in order to obtain the speed in Gbps or MBps respectively. The number of packets varies in the case of short packets or of sizes greater than the maximum value. Hence, the data length in bits or Bytes shall be used for precision.
- Consequently, speed may be expressed in different units, such as number of packets per second, MBps or Gbps, and converted as appropriate.
- Design constraints such as burst size, bytes per interval, repeat values, etc. must be considered, as any change in those fields affects the data length of packets.(19)(20)
- The algorithms are independent of each other and hence can be implemented in Bulk IN-OUT transactions.
- The number of endpoints is to be taken into account so that speed is calculated for each of their transfers.
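The unit handling described in these considerations can be sketched as a small conversion helper. It assumes decimal prefixes (1 MB = 10^6 Bytes, 1 Gb = 10^9 bits) and a payload count in Bytes; the function name `convert_speed` and the unit labels are illustrative.

```python
SECONDS_PER_UNIT = {"s": 1.0, "us": 1e-6, "ns": 1e-9, "ps": 1e-12}


def convert_speed(total_bytes, t_diff, time_unit="ns"):
    """Convert a Byte count and a transaction time into MBps and Gbps."""
    if t_diff <= 0:
        raise ValueError("transaction time must be positive")
    seconds = t_diff * SECONDS_PER_UNIT[time_unit]
    bytes_per_second = total_bytes / seconds
    return {
        "MBps": bytes_per_second / 1e6,        # megabytes per second
        "Gbps": bytes_per_second * 8 / 1e9,    # gigabits per second
    }
```

For instance, 400 packets of 1024 Bytes transferred in 1,024,000 ns correspond to about 400 MBps, or about 3.2 Gbps of payload.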
There might be variation in identification due to many factors such as bus configurations, DMA access, etc. However, the following extensions can be made in the form of tabulations in order to infer how speed behaves under the constraints that are generated –
- The maximum and minimum values of speed for overall constraint based analysis.
- The maximum and minimum values of speed for each of the constraints applied independently.
- The corresponding values of affecting factors for both the above cases.
- The ideal cases where practical speed is achieved with almost zero causative factors.
- The relationship between individual causative factors can be established.
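One possible way to produce such tabulations is sketched below. It assumes each test run has already been reduced to a record with a `constraint` label, a computed `speed` and a `factors` dictionary of causative-factor counts; this record layout and the function name `tabulate_speed` are hypothetical.

```python
def tabulate_speed(runs):
    """Summarize minimum and maximum speed overall and per constraint.

    Returns (table, overall): table maps each constraint to its extreme
    speeds and the affecting factors observed in those runs; overall holds
    the extremes across all runs, or None if there are no runs.
    """
    if not runs:
        return {}, None

    by_constraint = {}
    for run in runs:
        by_constraint.setdefault(run["constraint"], []).append(run)

    table = {}
    for constraint, group in by_constraint.items():
        slowest = min(group, key=lambda r: r["speed"])
        fastest = max(group, key=lambda r: r["speed"])
        table[constraint] = {
            "min_speed": slowest["speed"],
            "min_factors": slowest["factors"],
            "max_speed": fastest["speed"],
            "max_factors": fastest["factors"],
        }

    overall = {
        "min_speed": min(r["speed"] for r in runs),
        "max_speed": max(r["speed"] for r in runs),
    }
    return table, overall
```

Runs whose factor counts are all zero would correspond to the ideal cases mentioned in the list.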
ADVANTAGES AND RELATED ISSUES
The proposed algorithms for estimating and computing the speed-based performance of USB 3.0 SuperSpeed Bulk transactions provide a base for analyzing the efficiency, with respect to data rate, of the IN and OUT transfers respectively. The algorithms share a similar flow but use distinct parameters and identifiers, thus providing a way to tackle speed-related issues through analysis.(21)(22)(23) The algorithms are straightforward and simple, as seen from the flowcharts, and ambiguities are warded off through the mathematical representation of the speed estimation process. The accessory parameters of reference are also helpful in arriving at best-case and worst-case results.(24) The algorithms themselves act as independent designs yet help in verifying whether the implementation conforms to the protocol specifications. An inter-relationship between causative factors helps to statistically ascertain the flow of the data. Additionally, this provides a faster method to analyze the speed-based performance of the product without much tedious effort. The basis of characterization is a quality check of the product through quantitative estimation.
The algorithms also involve some issues that need to be taken care of during implementation. A major point of concern is implementing them as modules or code so that they can be used with the design under test. All scenarios need to be tested, with extra care in identifying the start and stop points of the transaction. Misidentifying signals may mix them up and lead to inaccurate results. Time synchronization is another related issue. The implementation should be done in accordance with the processor clock; otherwise packets might be lost. The above algorithms may not be the fastest in terms of computation time, but their simplicity makes them preferable to manual derivation.
For the USB 3.0 SuperSpeed Bulk transaction, speed is the standing criterion for performance measurement. The above algorithms are beneficial for analyzing the different constraints associated with a large variety of test cases dealing with bulk transfer.(25) These methods can also be extended to characterize speed values in cases involving both IN and OUT transactions with different endpoints. A statistical analysis of the same can be done so that corner cases can be derived. The tabulations also act as data sheets for the design products under analysis.(26)(27) This helps to obtain a relation between the causative factors and speed-based performance under different criteria. The design products may be verified by coding the algorithms and implementing them through the design modules. The simulations shall give a reference model to characterize the efficiency. The algorithm, being generic, can be extended to other types of transfers as well. Furthermore, it could give rise to manual as well as automated techniques of speed estimation and computation for any USB transaction. A separate module may be designed in the form of a speed analyzer that automatically generates results based on test cases with specific runs.
The generic algorithms governing the basis of speed estimation and computation for Bulk IN and OUT transactions in USB 3.0 SuperSpeed protocol are presented in this paper. The points of reference pertaining to the parameters responsible for speed analysis have been discussed. The flow mechanisms for the bulk transactions have been described with appropriate causative factors. The interaction of different packets during the flow has been elaborated in the pre-requisites of the algorithm and suitable flowcharts depicting the same have been shown. The mathematical representations in the form of equations supporting these algorithms have been presented. A proposal has been placed through algorithms towards speed estimation, computation and performance characterization of causative factors in this paper.
The authors would like to thank Mr. Anurag Kumar and Mr. Akshay Patil, Design and Verification Engineers, and all other team members from Innovative Logic Inc for their valuable technical and non-technical support.
(1) A. Tomar and E. Linn, “Introduction to superspeed usb3.0 protocol,” vol. 1, pp. 1–20, April 2011. [Online]. Available: https://www.element14.com/community/servlet/JiveServlet/previewBody/36206-102-2-218482/Introduction%20to%20USB%203.0%20Protocol.pdf
(2) N. Chang, “Usb female connector,” GEN US7 927 145, April, 2011. [Online]. Available: https://www.google.com/patents/US7927145
(3) F. Chao, “Usb 3.0 vs usb 2.0 ys esata,” December 2011. [Online]. Available: http://aztcs.org/meeting notes/winhardsig/USB3/USB3versus.pdf
(4) USB-Implementers-Forum, Universal Serial Bus 3.0 Specification, 1st ed., Hewlett-Packard Company, Intel Corporation, Microsoft Corporation, NEC Corporation, ST-Ericsson, Texas Instruments, March 2011. [Online]. Available: http://www.usb.org
(5) USB 3.0 Technology, HP Technologies, January 2014. [Online]. Available: http://h20195.www2.hp.com/V2/GetPDF.aspx/4AA4-2724ENW.pdf
(6) P. Hung, T. Sakurai, and P. Fung, “Method and apparatus of usb 3.0 based computer, console and peripheral sharing,” GEN US20 110 113 166, May, 2011. [Online]. Available: https://www.google.com/patents/US20110113166
(7) P. Hung and T. Sakurai, “Translation usb intermediate device and data rate apportionment usb intermediate device,” GEN US20 110 022 769, 1, 2011. [Online]. Available: https://www.google.com/patents/US20110022769
(8) USB 3.0 Background Factors, New Features, and Applicability as a Camera Interface, IDS Imaging Development Systems GmbH, 2012. [Online]. Available: www.ids-imaging.com
(9) USB 3.0 IP SuperSpeed Device Controller IP, Inno Logic Inc. [Online]. Available: http://inno-logic.com/usb-3-otg/
(10) PHY Interface For the PCI Express USB 3.0 Architectures, Intel Corporation, 2009. [Online]. Available: http://www.intel.com.tw/content/www/tw/zh/io/pci-express/phy-interface-pci-express-sata-usb30-architectures.html
(11) USB 3.0 Internal Connector and Cable Specification, Intel Corporation, August 2010. [Online]. Available: http://www.intel.in/content/dam/doc/technical-specification/usb3-internal-connector-cable-specification.pdf
(12) Intel Universal Serial Bus (USB) Frequently Asked Questions (FAQ), Intel Corporation, 2014. [Online]. Available: http://www.intel.in/content/www/in/en/io/universal-serial-bus/universal-serial-bus-faq.html
(13) M. Kakish, “Motherboard compatible with multiple versions of universal serial bus (usb) and related method,” GEN US20 110 191 503, August, 2011. [Online]. Available: https://www.google.com/patents/US20110191503
(14) J. Lai, “User-friendly usb connector,” GEN US7 717 717, May, 2010. [Online]. Available: https://www.google.com/patents/US7717717
(15) D. Luke, “Simplified universal serial bus (usb) hub architecture,” GEN US7 657 691, February, 2013. [Online]. Available: https://www.google.com/patents/US7657691
(16) N. Murata, “Usb host controller, information processor, control method of usb host controller, and storage medium,” GEN US20 100 005 327, January, 2010. [Online]. Available: https://www.google.com/patents/US20100005327
(17) P.-J. Pietri, P. Thomsen, and M. Christiansen, “Usb 3.0 support in mobile platform with usb 2.0 interface,” GEN US8 510 494, August, 2013. [Online]. Available: https://www.google.com/patents/US8510494
(18) R. A. Dunstan, G. A. Solomon, and J. A. Schaefer, “Universal serial bus host to host communications,” GEN US20 100 169 511, July, 2010. [Online]. Available: https://www.google.com/patents/US20100169511
(19) USB Overview, Silicon Labs, 2011. [Online]. Available: http://www.silabs.com/Support%20Documents/Software/USB Overview.pdf
(20) W. Szeremeta and M. Trinh, “Universal test connector for connecting a sata or usb data storage device to a data storage device tester,” GEN US8 753 146, June, 2014. [Online]. Available: https://www.google.com/patents/US8753146
(21) USB 3.0 FAQ Sheet, Targus. [Online]. Available: http://targus.com/hk/downloads/usb-3 faq sheet.pdf
(22) Data Manual – TUSB1310 USB 3.0 Transciever, Texas Instruments, December 2009. [Online]. Available: http://www.ti.com/cn/lit/pdf/sllse16
(23) “Usb 3.0 superspeeds,” December 2011. [Online]. Available: http://www.usr.com/en/education/usb-30-peripherals/
(24) H. Van Antwerpen and H. Letourneur, “Usb port connected to multiple usb compliant devices,” GEN US8 364 870, January, 2013. [Online]. Available: https://www.google.com/patents/US8364870
(25) L. Wu, “Motherboard with universal series bus connector,” GEN US20 120 033 369, February, 2012. [Online]. Available: https://www.google.com/patents/US20120033369
(26) A. K. R. Yamamoto and M. Tone, “Wireless communication system,” GEN US20 110 064 023, March, 2011. [Online]. Available: https://www.google.com/patents/US20110064023
(27) T. Moore, “Usb 3.0 technical overview,” October 2009. [Online]. Available: http://www.mcci.com/mcci-v5/pdf/