UDMA SD CARD INTERFACE
The SDIO (Secure digital I/O) card provides high speed data I/O with low power consumption for mobile electronic devices. Host devices supporting SDIO can connect the SD slot with I/O devices like Bluetooth, wireless, LAN, GPS Receiver, Digit Camera etc.
SDIO INTERFACE BUS:
FEATURES:
It has a clock, command and 4-bit data bus.
Supports quad mode.
Five types of response supported:
No response
48 bits with CRC
48 bits with NO CRC.
136 bits with BUSY check.
Error output pin.
End of transfer output pin.
Four error status for command transfer supported:
No error
Response timeout
Response wrong direction
Response busy timeout
THEORY OF OPERATION:
Communication over the SD bus is based on command and data bit streams that are initiated by a start and terminated by a stop bit.
- Command: Operation is started by generating a command. A host can
send a command to either a single card or to all the connected cards. A command is transferred serially on the CMD line. MSB is transmitted first and LSB is transmitted last. Transmission bit is 1 as command is transmitted from host to device.
Command token format|Command token|
Bit position |
47 |
46 |
[45:40] |
[39:8] |
[ 7: 1] |
0 |
---|---|---|---|---|---|---|
Width |
1 |
1 |
6 |
32 |
7 |
1 |
Value |
0 |
1 |
x |
x |
x |
1 |
D escription |
S tart bit |
Transmission bit |
cmd opcode |
Cmd arg |
c rc |
End bit |
- Response: A response is sent from an addressed card to the host as an
answer to a previously received command. A response is transferred serially on the CMD line. Transmission bit is 0 as response is transmitted from device to host.
Response token format:
Bit position |
47 |
46 |
[45:40] |
[39:8] |
[ 7: 1] |
0 |
---|---|---|---|---|---|---|
Width |
1 |
1 |
6 |
32 |
7 |
1 |
Value |
0 |
0 |
x |
x |
x |
1 |
D escription |
S tart bit |
T ransmission bit |
Command index |
Card status |
c rc |
End bit |
Bit position |
135 |
134 |
[133:128] |
[127:1] |
0 |
---|---|---|---|---|---|
Width |
1 |
1 |
6 |
127 |
1 |
Value |
0 |
0 |
111111 |
x |
1 |
D escription |
S tart bit |
T ransmission bit |
Reserved |
Response content including CRC |
End bit |
- Data: Data can be transferred from the card to the host or vice
versa. It is transferred via the data lines.
Data transfer to/from the SD memory card is done in blocks. Data blocks are succeeded by CRC bits.
BLOCK DIAGRAM:
It contains a reg interface for writing and reading from the register and a SDIO TX/RX module which instantiates command and data modules. Command module handles command interface and data module handles data transfer.
SDIO TX/RX:
This module is responsible for sending and receiving command and data between host and device. It instantiates command and data modules.
It uses the clock generated by udma_clockgen as input clock.
- Synchronous start generated by the edge propagator is used to get the
command start bit. This command start bit is sent to the command module which marks the start of the command.
This module works in three states:
CMD_ONLY: This is the default state.
- State is set to WAIT_EOT if there is no block to be transmitted
else state is set to WAIT_LAST.
WAIT_LAST: Wait for the last piece of data to be transferred.
- After transferring the last piece of data we go to state
WAIT_EOT.
- Data module sends a high signal which indicates transfer of
last data.
WAIT_EOT: Wait for the end of transaction.
- If command eot and data eot sent by command and data module
respectively are high then go to state CMD_ONLY and reset command eot and data eot.
- Status16 bit status output is transmitted through this block. This
status can be read through SDIO_REG_STATUS. Non-negative status would generate and error.
Bit position |
[15:14] |
[13:8] |
[7:6] |
[5:0] |
---|---|---|---|---|
value |
00 |
x |
00 |
x |
Description |
reserved |
Data status |
reserved |
Command status |
It instantiates two sub blocks: command and data.
Command block: This module handles the command interface.
It supports three types of response status:
Response timeout
Response wrong direction
Response busy timeout
- It supports five types of responses: Response type is written to
register
SDIO_REG_CMD_OP.
Null response: No response.
48 bits with CRC.
Supports CRC.
Response length is 38 bits.
48 bits with NO CRC
Supports CRC.
Response length is 38 bits.
136 bits
Supports CRC.
Response length is 133 bits.
48 bits with a busy check.
Supports CRC.
Response length is 38 bits.
Supports busy signal.
- Command output enable signal: sdcmd_oen_o is an active low signal. It
is enabled during transfer of command and is disabled during reception of response.
It goes through twelve states:
ST_IDLE: Default state when the system is IDLE.
Clock is disabled initially.
- If the command start bit is high then state is set to
ST_TX_START and clock is enabled.
ST_TX_START: Send the start bit to start the transaction.
Start bit is sent through sdcmd_o.
State is set to ST_TX_DIR.
ST_TX_DIR: Set the transmission bit of the command.
Transmission bit is sent through sdcmd_o.
CRC is enabled.
- State is set to ST_TX_SHIFT and 38 bit command data is
transmitted.
ST_TX_SHIFT:
MSB of the command is sent as output through sdcmd_o.
CRC is enabled.
Command data is shifted to the left by 1 bit.
- If the 38 bit command data is transmitted go to state ST_TX_CRC
and start counting the CRC bits. There are 7 CRC bits.
ST_TX_CRC: Send CRC output and shifts CRC.
CRC output is sent as output through sdcmd_o.
CRC is enabled.
- State is set to ST_TX_STOP after successfully transmitting the
CRC bits.
ST_TX_STOP: Transmit the stop bit of the command.
Stop bit is sent through sdcmd_o.
- Start read is enabled which indicate we can read from data
block.
If the response is enabled, set the state to ST_RX_START.
If the response is disabled, go to state ST_WAIT.
ST_RX_START: Initiates the reception of response.
Response is received via sdcmd_i
State is set to ST_RX_DIR if the start bit is received.
- If the start bit is not received till 38 clock cycle then
response status is set to response timeout.
- ST_RX_DIR: Check if the received command indicates the correct
direction.
Direction bit is received and state is set ST_RX_SHIFT.
Response data is received.
- Receiving an incorrect bit would set the response status to
response wrong direction and the state is set to ST_IDLE.
ST_RX_SHIFT: Shift in response data.
CRC is calculated.
- If the response data is received and response crc is enabled
then state is set to ST_TX_CRC.
- If response data is received and response crc is disabled and
response busy is enabled then go to state ST_WAIT_BUSY.
- If response count is completed but response crc and response
busy are not enabled then go to ST_WAIT.
ST_RX_CRC:
If CRC is received then go to ST_WAIT..
ST_WAIT_BUSY:
- If a low busy signal is received from the data block then we go
to ST_WAIT.
- If the busy signal is high till 256 clock cycle status is set
to response busy timeout.
ST_WAIT:
- After waiting for 8 clock cycles, high output is asserted
through command eot output and high start write output is sent which indicates successful write command.
Set the state to ST_IDLE.
- Single bit is transferred at every posedge of the clock. Transmitting
a 38 bit command data would take 38 clock cycles and a 7 bit crc would take 7 clock cycles. Similarly receiving a response would take response length clock cycle.
- Response content is received through sdcmd_i and is sent as output
through rsp_data_o.
Data block: This module is responsible for handling data transfer.
Support response status timeout.
Supports five types of responses:
Null response: No response.
48 bits with CRC.
Supports CRC.
Response length is 38 bits.
48 bits with NO CRC
Supports CRC.
Response length is 38 bits.
136 bits
Supports CRC.
Response length is 133 bits.
48 bits with a busy check.
Supports CRC.
Response length is 38 bits.
Supports busy signal.
Supports 16 bit CRC.
- Data output enable signal: sddata_oen_o is an active low signal.
It is enabled during transfer of data and is disabled during reception of data.
Data can be transmitted in 2 modes:
- Single count mode: Data is transferred only on DATA[0] pin. LSB
is transmitted first and MSB is transmitted last.
Quad mode: Data is transferred on all the four data pins.
States:
ST_IDLE:
For read operation go to state ST_RX_START.
For write operation go to ST_TX_START.
Read and write operation is decided by data_rwn_i.
ST_TX_START:
Send the start bit through sddata_o to start the transaction.
Go to state ST_TX_SHIFT.
one block is transmitted.
ST_TX_SHIFT:
Data output is enabled.
CRC is calculated.
Direction bit is sent.
If the whole block is transmitted go to state ST_TX_CRC.
ST_TX_CRC:
Output crc through sddata_o.
state is set to ST_TX_END.
ST_TX_END
Send ‘F’ through sddata_o.
Go to state ST_TX_CRCSTAT.
ST_TX_CRCSTAT
Wait for 8 clock cycles and go to state ST_TX_BUSY
ST_TX_BUSY
After waiting for 512 cycles go to timeout phase:
Sdio timeout counter is increased till it reaches 1023.
- State is set to ST_IDLE and high output is driven through
eot output.
- If 512 cycles is not reached and high bit is received in LSB of
incoming data:
- If all the blocks are transmitted then the state is set to
ST_IDLE and eot is asserted.
- If the whole block is not transmitted then go to ST_TX_START
and transmit the next block.
ST_RX_START:
Data is received through sddata_i.
Go to state ST_RX_SHIFT if the start bit is received.
- If the start bit is not received, set the status to response
timeout and go to ST_IDLE.
ST_RX_SHIFT:
CRC is calculated.
After receiving the block state is set to ST_RX_CRC.
ST_RX_CRC:
CRC is calculated.
- After receiving the 16 bit CRC, if the all the blocks are
received assert eot else go to ST_RX_START and receive another block.
ST_WAIT
- After waiting for 8 clock cycle assert eot and go to state
ST_IDLE.
- This block contains a data input, data output, ready output and valid
output.
- Data input: Receive the data from tx fifo and transfer it to
device.
Data output: Transmits the data received from device to rx fifo.
Ready output: Indicate if block is ready to write data to tx fifo.
- Valid output: Indicate if there is valid data to be written to rx
fifo.
(u_clockgen) udma_clkgen:-
Ports:-
input logic clk_i,
input logic rstn_i,
input logic dft_test_mode_i,
input logic dft_cg_enable_i,
input logic clock_enable_i,
input logic [7:0] clk_div_data_i,
input logic clk_div_valid_i,
output logic clk_div_ack_o,
output logic clk_o
Theory of operation :-
This is a Integer clock divider with async configuration interface.
The module will be in four modes namely IDLE, STOP, WAIT, RELEASE.
The clock_enable_i should be high and the module should enable the clock for the output to be visible.
- When the module is in reset mode by making rstn_i low,then the mode
is set to IDLE.The multiplexer will select the input clock to be the output clock of the module.The clock divider is enabled.The clock is enabled by the module.
- The signal clk_div_valid_i is sent to a pulp_sync_wedge module as
serial_i and output of the module is clk_div_valid_sync which represents the r_edge_o of the pulp_sync_edge.
- Now at every positive edge of the clk_i,If clk_div_valid_sync is
high,then the clk_div_data_i is read and the clock divider value is updated.Also ,the next state of the module is change to STOP state, so that in next clock cycle as state is in STOP mode, the multiplexer sets the output to the clock divider output and schedules the update of clock divider config to next clock cycle by changing the next state to WAIT.The clock is disabled in this state to make the config changes.
- At the next clock edge as it is in WAIT state,here the clock divider config is updated ,which means the counter value is set
to default 0 and output clock is made low.The next state is RELEASE state.
- In the next clock edge where the state is RELEASE,from this moment
the clock divider starts working with a new clock divider value from the start(as counter i made 0).The next state is again back to IDLE state.
- In the next clock edge(IDLE state),The clock is enabled so that
we can see the output in the output pin.The module will remain in this state and clock divider will be toggle the output clock signal after the counter reaches a value equal to (clock divisor value -1) and at this moment the counter also becomes 0 so that it can be incremented by one unit again at every clock edge.This will continue until again clk_div_valid_sync become high again then The clock divider value is updated and the state goes to STOP mode for next clock cycle to reset the clock divisor and counter to 0.
The clk_o is nothing but the output signal of the clock divider.
Pulp_sync_wedge:-
Ports:-
input logic clk_i,
input logic rstn_i,
input logic en_i,
input logic serial_i,
output logic r_edge_o,
output logic f_edge_o,
output logic serial_o
The module takes an input(serial_i) and a clock signal.There is a submodule which contains a 2 bit shift register which will be storing the signal value at every clock edge by shifting to right and storing new signal value in MSB.The output of the shift register is the LSB which is connected in cascaded fashion to a module which writes the output serial_o .At every clock positive edge , the serial_o is updated with the current LSB of the shift register and LSB is updated by shift register with new value ,both happening in a parallel non blocking fashion.Whenever the LSB of the shift register changes to high and serial_o was low,then r_edge_o is made high ,but at next clock cycle as serial_o is updated to LSB of shift register ,r_edge_o goes low (So r_edge_o stays high for only one clock cycle).
Edge Propagator:-
Ports:-
input logic clk_tx_i,
input logic rstn_tx_i,
input logic edge_i,
input logic clk_rx_i,
input logic rstn_rx_i,
output logic edge_o
Theory of operation:-
- The main purpose of the module is to propagate the input value in
edge_i for a period of time.
So,when rstn_tx_i is low, then the output is low.
- In active mode ,Whenever the edge_i is high it is stored in a
register and reflected in a signal in the next positive edge of clk_tx_i.The signal which is sensitive to edge_i at every positive edge of clk_tx_i is sent to pulp_sync_wedge.This signal will remain high until we get an output from pulp_sync_wedge(serial_o)( clock clk_rx_i ) for the signal sensitive to edge_i.Then after three clock cycles after we get a output from pulp_sync_wedge we will make the signal low .After this the signal is again set sensitive to edge_i .So everything repeats after the edge_i is triggered again.Now the output of the edge propagator (edge_o) is nothing but the rising edge pulp_sync_wedge based signal,i.e.,r_edge_o, which asserts one clock cycle after the signal sensitive to edge_i is made high.
udma_dc_fifo:
Ports:-
input logic src_clk_i,
input logic src_rstn_i,
input logic [DATA_WIDTH-1:0] src_data_i,
input logic src_valid_i,
output logic src_ready_o,
input logic dst_clk_i,
input logic dst_rstn_i,
output logic [DATA_WIDTH-1:0] dst_data_o,
output logic dst_valid_o,
input logic dst_ready_i
Theory of operation:-
- The module contains two sub modules ,one connected to the source to
receive the data and enter the data into FIFO and another module connected to the destination and works on sending the FIFO data to destination.
din:
This is connected to the source.
input clk;
input rstn;
input [DATA_WIDTH - 1 : 0] data;
input valid;
output ready;
output [BUFFER_DEPTH - 1 : 0] write_token;
input [BUFFER_DEPTH - 1 : 0] read_pointer;
output [DATA_WIDTH - 1 : 0] data_async;
- The din contains the dc_buffer module which contains the actual
FIFO , dc_token_ring module which contains the logic to compute the write pointers and finally the dc_full_detector module which contains the logic to deduce whether FIFO is full or not.
dout:
input [DATA_WIDTH - 1 : 0] data_async;
input clk;
input rstn;
output [DATA_WIDTH - 1 : 0] data;
output valid;
input ready;
input [BUFFER_DEPTH - 1 : 0] write_token;
output [BUFFER_DEPTH - 1 : 0] read_pointer;
- dout contains the dc_token_ring module which contains the logic to
compute the read_pointer to read data from when FIFO is not empty .
- It will also be taking write_token from the din module and perform
the logic of dc_synchronizer on it to get the synchronized version of write_token.This synchronized write_token along with read_pointer is used to deduce whether the FIFO is empty or not ,using bit manipulation.
- The dc_synchronizer synchronizes the write_token with respect to
clk ,which means at every clock edge the value of the write_token is stored but it is reflected in the output in the next clock edge.
If the FIFO is empty then we cannot read from the FIFO.
io_tx_fifo:-
This is a TX FIFO with outstanding request support.
Ports:-
input logic clk_i,
input logic rstn_i,
input logic clr_i,
output logic req_o,
input logic gnt_i,
output logic [DATA_WIDTH-1 : 0] data_o,
output logic valid_o,
input logic ready_i,
input logic valid_i,
input logic [DATA_WIDTH-1 : 0] data_i,
output logic ready_o
Theory of operation:-
- The module contains a FIFO module which is a synchronous FIFO with
configurable width and depth and logic to keep record of any outstanding requests and if requests are solved and accordingly update the record.
- The FIFO module does three things which are updating the buffer ,
keeping record of the number of elements in the buffer and updating the write and read pointers in the buffer.
- If the signal valid_i is high and the buffer is not full then at
the current position at the write pointer of the buffer array the data from data_i is stored and the write pointer is incremented.
- If the signal ready_i is high and the buffer is not empty then the
read pointer is incremented.
- The current read pointer position in the buffer array is read into
data_o always.The read pointer is updated at every clock positive edge as mentioned in above bullet point.
- While above functions are happening at every clock positive edge
the number of elements in the buffer is also updated and recorded after the operation.
- Coming to the logic for the outstanding requests,We calculate the
number of free spaces left in the buffer and if it is equal to the number of the outstanding requests then we signal to stop further requests.If there is no signal to stop the requests and the buffer is ready to accept requests (which means it is not full) then we can accept the requests.
- So at every clock positive edge if the module and accept requests
based on above conditions and the request is granted by making gnt_i signal high then if valid_i is low(meaning the request is not valid) or The buffer is not ready (meaning the buffer is full) then the number of outstanding requests will be incremented.If valid_i is high and the buffer is ready then the number of outstanding requests will be decremented.
Whole operation:-
- As we can see in the above block diagram,some of the input signals go
into the register interface and few signals are generated by the register interface which are used for various operations in SDIO.
- We have a module called u_clockgen which takes in a few parameters
from the register and generates a sdio clock from the peripheral clock (u_clockgen is a Integer clock divider with async configuration interface).
- The SDIO clock generated will be used in the SDIO module.There is a
edge propagator module which takes in s_start from register which is in sync with system clock sys_clk_i and sdio clock as input ,finally generates the resign edge of the s_start in sync with the sdio clock ,So this module generates resign edge of s_start in sync with the sdio clock instead of sys_clk_i.
- The s_start_sync ,sdio clock and signals from registers go to
sdio_tx_rx where actual logic for transmission and receiving is executed.sdio_tx_rx runs in sync with sdio clock.
- The signal s_err from the module sdio_tx_rx is synchronized to
sys_clk_i using the module pulp_sync_wedge and set to register to be stored.
- The s_eot signal from sdio_tx__rx is in sync with sdio clock ,there
is edge propagator with name i_eot_sync which generates the rising edge of the signal s_eot in sync with sys_clk_i and the synchronized signal is sent to register to be stored.
- For communication between( SDIO and uDMA )and (SDIO and I/O) we use
FIFOs.
FIFOs for SDIO and uDMA:-
- There are three FIFOs in total , i_dc_fifo_tx , u_dc_fifo_rx and
io_tx_fifo.
- u_dc_fifo_rx is a udma_dc_fifo which is used to transfer
the data from the clock domain of sdio clock domain(source) and sys_clk_i (system clock)(destination).So the data from peripheral I/O gets into sdio_tx_rx module, to communicate this received data from the sdio_tx_rx to uDMA we use this FIFO u_dc_fifo_rx.
- i_dc_fifo_tx is different clock FIFO between sys_ck_i(source)
and sdio clock(destination) .So basically The data which comes as input to the top module from uDMA is sent first to io_tx_fifo which is a TX FIFO with outstanding request support.The data in this FIFO goes to i_dc_fifo_tx which is FIFO between sys_clk_i as source and sdio clock as destination ,So through this FIFO named i_dc_fifo_tx ,the data can be sent to the sdio_tx_rx module in SDIO.So the data from the uDMA first gets into the FIFO named io_tx_fifo and then from there into a FIFO named i_dc_fifo_tx and finally to sdio_tx_rx.