1. Getting the most out of Software Stepping

Generating step pulses in software has one very big advantage - it’s free. Just about every PC has a parallel port that is capable of outputting step pulses that are generated by the software. However, software step pulses also have some disadvantages:

  • limited maximum step rate

  • jitter in the generated pulses

  • extra load on the CPU

This chapter has some steps that can help you get the best results from software generated steps.

1.1. Run a Latency Test

The CPU is not the only factor determining latency. Motherboards, graphics cards, USB ports and many other things can degrade it. The best way to know what to expect from a PC is to run the RT latency tests.

Run the latency test as described in the Latency Test chapter.

While the test is running, you should abuse the computer. Move windows around on the screen. Surf the web. Copy some large files around on the disk. Play some music. Run an OpenGL program such as glxgears. The idea is to put the PC through its paces while the latency test checks to see what the worst case numbers are.

The last number in the column labeled Max Jitter is the most important. Write it down - you will need it later. It contains the worst latency measurement during the entire run of the test. In the example above, that is 6693 nanoseconds, or 6.69 microseconds, which is excellent. However, the example only ran for a few seconds (it prints one line every second). You should run the test for at least several minutes; sometimes the worst case latency doesn’t happen very often, or only happens when you do some particular action. I had one Intel motherboard that worked pretty well most of the time, but every 64 seconds it had a very bad 300 µs latency. Fortunately that is fixable; see Fixing SMI issues on the LinuxCNC Wiki.

So, what do the results mean? If your Max Jitter number is less than about 15-20 microseconds (15000-20000 nanoseconds), the computer should give very nice results with software stepping. If the max latency is more like 30-50 microseconds, you can still get good results, but your maximum step rate might be a little disappointing, especially if you use microstepping or have very fine pitch leadscrews. If the numbers are 100 µs or more (100,000 nanoseconds), then the PC is not a good candidate for software stepping. Numbers over 1 millisecond (1,000,000 nanoseconds) mean the PC is not a good candidate for LinuxCNC, regardless of whether you use software stepping or not.
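
The bands above can be written down as a small lookup. This is only a sketch: the function name and the returned labels are made up for illustration, 20 µs is assumed as the cutoff for the "about 15-20 microseconds" band, and readings that fall between the bands the text describes are reported as such rather than guessed at.

```python
def classify_latency(max_jitter_ns):
    """Map a Max Jitter reading (in nanoseconds) to the bands described above.

    Hypothetical helper. The 20 µs cutoff for the best band is an assumption.
    """
    if max_jitter_ns > 1_000_000:
        return "not a good candidate for LinuxCNC"
    if max_jitter_ns >= 100_000:
        return "not a good candidate for software stepping"
    if 30_000 <= max_jitter_ns <= 50_000:
        return "usable, but the maximum step rate may disappoint"
    if max_jitter_ns < 20_000:
        return "very nice results with software stepping"
    # 20-30 µs and 50-100 µs are not covered by the text.
    return "between the bands described in the text"


print(classify_latency(6693))
```

For the 6693 ns reading mentioned earlier, this lands in the best band.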

Note that if you get high numbers, there may be ways to improve them. For example, one PC had very bad latency (several milliseconds) when using the onboard video. But a $5 used video card solved the problem - LinuxCNC does not require bleeding edge hardware.

1.2. Figure out what your drives expect

Different brands of stepper drives have different timing requirements on their step and direction inputs. So you need to dig out (or Google for) the data sheet that has your drive’s specs.

From the Gecko G202 manual:

Step Frequency: 0 to 200 kHz
Step Pulse "0" Time: 0.5 µs min (Step on falling edge)
Step Pulse "1" Time: 4.5 µs min
Direction Setup: 1 µs min (20 µs min hold time after Step edge)

From the Gecko G203V manual:

Step Frequency: 0 to 333 kHz
Step Pulse "0" Time: 2.0 µs min (Step on rising edge)
Step Pulse "1" Time: 1.0 µs min

Direction Setup:
    200 ns (0.2 µs) before step pulse rising edge
    200 ns (0.2 µs) hold after step pulse rising edge

From the Xylotex datasheet:

Minimum DIR setup time before rising edge of STEP pulse: 200 ns
Minimum DIR hold time after rising edge of STEP pulse: 200 ns
Minimum STEP pulse high time: 2.0 µs
Minimum STEP pulse low time: 1.0 µs
Step happens on rising edge

Once you find the numbers, write them down too - you need them in the next step.

1.3. Choose your BASE_PERIOD

BASE_PERIOD is the heartbeat of your LinuxCNC computer. Every period, the software step generator decides if it is time for another step pulse. A shorter period will allow you to generate more pulses per second, within limits. But if you go too short, your computer will spend so much time generating step pulses that everything else will slow to a crawl, or maybe even lock up. Latency and stepper drive requirements affect the shortest period you can use, as we will see in a minute.

Let’s look at the Gecko example first. The G202 can handle step pulses that go low for 0.5 µs and high for 4.5 µs. It needs the direction pin to be stable 1 µs before the falling edge, and to remain stable for 20 µs after the falling edge. The longest timing requirement is the 20 µs hold time. A simple approach would be to set the period at 20 µs. That means that all changes on the STEP and DIR lines are separated by 20 µs. All is good, right?

Wrong! If there was ZERO latency, then all edges would be separated by 20 µs, and everything would be fine. But all computers have some latency. Latency means lateness. If the computer has 11 µs of latency, that means sometimes the software runs as much as 11 µs later than it was supposed to. If one run of the software is 11 µs late, and the next one is on time, the delay from the first to the second is only 9 µs. If the first one generated a step pulse, and the second one changed the direction bit, you just violated the 20 µs G202 hold time requirement. That means your drive might have taken a step in the wrong direction, and your part will be the wrong size.
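
The arithmetic behind that failure is short enough to sketch. The function name below is made up for illustration; it assumes, as the paragraph does, that one run of the software can be late by the full latency while the next run is on time.

```python
def shortest_gap_us(period_us, latency_us):
    """Worst-case time between two consecutive runs of the step generator.

    Assumes the first run is late by the full latency and the second is on time.
    """
    return period_us - latency_us


G202_DIR_HOLD_US = 20.0  # hold time from the G202 manual quoted earlier

gap = shortest_gap_us(period_us=20.0, latency_us=11.0)
print(f"shortest gap: {gap} us, hold time met: {gap >= G202_DIR_HOLD_US}")
```

With a 20 µs period and 11 µs of latency the gap can shrink to 9 µs, well short of the 20 µs the drive asks for.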

The really nasty part about this problem is that it can be very very rare. Worst case latencies might only happen a few times a minute, and the odds of bad latency happening just as the motor is changing direction are low. So you get very rare errors that ruin a part every once in a while and are impossible to troubleshoot.

The simplest way to avoid this problem is to choose a BASE_PERIOD that is the sum of the longest timing requirement of your drive, and the worst case latency of your computer. If you are running a Gecko with a 20 µs hold time requirement, and your latency test said you have a maximum latency of 11 µs, then if you set the BASE_PERIOD to 20+11 = 31 µs (31000 nano-seconds in the ini file), you are guaranteed to meet the drive’s timing requirements.
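
As a quick check of that rule, here is a minimal sketch using the G202 numbers quoted earlier and the 11 µs latency from this example. The function and dictionary names are illustrative only, not part of LinuxCNC.

```python
def min_base_period_ns(longest_requirement_ns, max_latency_ns):
    """Simplest safe period: longest drive timing requirement plus worst-case latency."""
    return longest_requirement_ns + max_latency_ns


# Timing requirements from the G202 manual, converted to nanoseconds.
g202_requirements_ns = {
    "step low": 500,
    "step high": 4_500,
    "dir setup": 1_000,
    "dir hold": 20_000,
}

longest_ns = max(g202_requirements_ns.values())
base_period_ns = min_base_period_ns(longest_ns, max_latency_ns=11_000)
print(base_period_ns)
```

This gives the 31000 ns figure used in the text.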

But there is a tradeoff. Making a step pulse requires at least two periods. One to start the pulse, and one to end it. Since the period is 31 µs, it takes 2x31 = 62 µs to create a step pulse. That means the maximum step rate is only 16,129 steps per second. Not so good. (But don’t give up yet, we still have some tweaking to do in the next section.)

For the Xylotex, the setup and hold times are very short, 200 ns each (0.2 µs). The longest time is the 2 µs high time. If you have 11 µs latency, then you can set the BASE_PERIOD as low as 11+2=13 µs. Getting rid of the long 20 µs hold time really helps! With a period of 13 µs, a complete step takes 2x13 = 26 µs, and the maximum step rate is 38,461 steps per second!
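
The step rate figures follow directly from the period. A minimal sketch, assuming two periods per step as described above; the function name is made up for illustration.

```python
def max_step_rate(period_ns, periods_per_step=2):
    """Steps per second when each step takes a whole number of base periods."""
    return 1_000_000_000 / (period_ns * periods_per_step)


for name, period_ns in (("Gecko G202", 31_000), ("Xylotex", 13_000)):
    # int() truncates, matching how the rates are quoted in the text.
    print(f"{name}: {int(max_step_rate(period_ns))} steps per second")
```

The same function can be reused for any period you settle on after testing how the PC responds.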

But you can’t start celebrating yet. Note that 13 µs is a very short period. If you try to run the step generator every 13 µs, there might not be enough time left to run anything else, and your computer will lock up. If you are aiming for periods of less than 25 µs, you should start at 25 µs or more, run LinuxCNC, and see how things respond. If all is well, you can gradually decrease the period. If the mouse pointer starts getting sluggish, and everything else on the PC slows down, your period is a little too short. Go back to the previous value that let the computer run smoothly.

In this case, suppose you started at 25 µs, trying to get to 13 µs, but you find that around 16 µs is the limit - any less and the computer doesn’t respond very well. So you use 16 µs. With a 16 µs period and 11 µs latency, the shortest output time will be 16-11 = 5 µs. The drive only needs 2 µs, so you have some margin. Margin is good - you don’t want to lose steps because you cut the timing too close.

What is the maximum step rate? Remember, two periods to make a step. You settled on 16 µs for the period, so a step takes 32 µs. That works out to a not bad 31,250 steps per second.

1.4. Use steplen, stepspace, dirsetup, and/or dirhold

In the last section, we got the Xylotex drive to a 16 µs period and a 31,250 step per second maximum speed. But the Gecko was stuck at 31 µs and a not-so-nice 16,129 steps per second. The Xylotex example is as good as we can make it. But the Gecko can be improved.

The problem with the G202 is the 20 µs hold time requirement. That plus the 11 µs latency is what forces us to use a slow 31 µs period. But the LinuxCNC software step generator has some parameters that let you increase the various times from one period to several. For example, if steplen is changed from 1 to 2, then there will be two periods between the beginning and end of the step pulse. Likewise, if dirhold is changed from 1 to 3, there will be at least three periods between the step pulse and a change of the direction pin.

If we can use dirhold to meet the 20 µs hold time requirement, then the next longest time is the 4.5 µs high time. Add the 11 µs latency to the 4.5 µs high time, and you get a minimum period of 15.5 µs. When you try 15.5 µs, you find that the computer is sluggish, so you settle on 16 µs. If we leave dirhold at 1 (the default), then the minimum time between step and direction is the 16 µs period minus the 11 µs latency = 5 µs, which is not enough. We need another 15 µs. Since the period is 16 µs, we need one more period. So we change dirhold from 1 to 2. Now the minimum time from the end of the step pulse to the changing direction pin is 5+16=21 µs, and we don’t have to worry about the Gecko stepping the wrong direction because of latency.

If the computer has a latency of 11 µs, then a combination of a 16 µs base period, and a dirhold value of 2 ensures that we will always meet the timing requirements of the Gecko. For normal stepping (no direction change), the increased dirhold value has no effect. It takes two periods totalling 32 µs to make each step, and we have the same 31,250 step per second rate that we got with the Xylotex.
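
The same reasoning can be applied to every timing requirement at once. The sketch below follows this chapter's description of the parameters as whole numbers of periods, and assumes that n periods guarantee at least n x period - latency. The function name and dictionary keys are illustrative; which requirement maps to steplen and which to stepspace depends on your step polarity, so they are listed here by drive requirement rather than by parameter name.

```python
def periods_needed(requirement_ns, period_ns, latency_ns):
    """Smallest whole n with n * period_ns - latency_ns >= requirement_ns (at least 1)."""
    n = -(-(requirement_ns + latency_ns) // period_ns)  # integer ceiling division
    return max(1, n)


PERIOD_NS = 16_000   # period chosen after testing the PC
LATENCY_NS = 11_000  # worst case from the latency test

# G202 requirements from the manual, in nanoseconds.
g202_ns = {"step high": 4_500, "step low": 500, "dir setup": 1_000, "dir hold": 20_000}

settings = {name: periods_needed(req, PERIOD_NS, LATENCY_NS) for name, req in g202_ns.items()}
for name, n in settings.items():
    guaranteed_ns = n * PERIOD_NS - LATENCY_NS
    print(f"{name}: {n} period(s), at least {guaranteed_ns} ns")
```

Only the direction hold time needs two periods, which corresponds to the dirhold value of 2 and the 21 µs minimum worked out above.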

The 11 µs latency number used in this example is very good. If you work through these examples with larger latency, like 20 or 25 µs, the top step rate for both the Xylotex and the Gecko will be lower. But the same formulas apply for calculating the optimum BASE_PERIOD, and for tweaking dirhold or other step generator parameters.

1.5. No Guessing!

For a fast AND reliable software based stepper system, you cannot just guess at periods and other configuration parameters. You need to make measurements on your computer, and do the math to ensure that your drives get the signals they need.

To make the math easier, I’ve created an Open Office spreadsheet Step Timing Calculator. You enter your latency test result and your stepper drive timing requirements and the spreadsheet calculates the optimum BASE_PERIOD. Next, you test the period to make sure it won’t slow down or lock up your PC. Finally, you enter the actual period, and the spreadsheet will tell you the stepgen parameter settings that are needed to meet your drive’s timing requirements. It also calculates the maximum step rate that you will be able to generate.

I’ve added a few things to the spreadsheet to calculate max speed and to handle stepper electrical calculations.