OK, it seems like your $N=52$ then and the normals to the planes are fixed. Then you are right that $N^3$ method takes 1ms time. Indeed, to find the parametric coefficients of the intersection of any 2 planes once you prestore the line directions and the inverse matrices takes just 6 multiplications and 3 additions, to find the cut takes 52 cycles of 6 multiplications, 5 additions, one division, and two comparisons each (keeping the max and the min), and doing all 1326 pairs that requires 1326*(9+52*14)=977262 elementary operations, which puts your processor speed at 10^9 multiplications per second, which is about right. There is an $O(N^2)$ algorithm if you know a point inside the polytope, which, once you are talking about kDop, you, probably do, but we need to count operations very carefully to see if it beats the $N^3$ one.
The key problem is that in the worst case when every plane matters, you may have about 3*52=156 edges which is just one tenth of the maximal possible amount and if you spend the same 9+52*14 operations on each edge, you will be just on the edge of what you need. Still, it is worth trying. It works under the assumption that no 4 planes intersect at one point. To make sure that it is the case, just perturb your Mins and Maxes by some random numbers that are much bigger than the precision with which you can compute the points but much less than what is visible to the eye (say, $10^{-4}$ times the object size, also make sure that you decrease Min's and increase Max's) but do all comparisons with much higher precision (like $10^{-12}$ times the object size) where the object size is just the maximal span in your directions, which you can find in linear time.
Start with finding one edge. To this end, just take any point inside the polytope, compute the vertices of the dual polytope with respect to that point, choose any direction, and take the furthest vertex of the dual polytope from the origin. Now, you know that this plane certainly matters. It will take you just linear time, so I'll not even count that. Once you figured out one plane, you can find an edge on that plane in the time 50*(9+52*14)=36850 elementary operations just as in the $N^3$ method. Keep track on which planes give you the boundary points.
Now, at each moment, keep track of edges you already have, and the vertices you have with their planes. For each vertex also keep the track of whether you know each of the adjacent edges (according to the pair of planes) completely. In the beginning, you have the list of 2 vertices with one known and two unknown edges.
Take the first vertex in the list with at least one unknown adjacent edge. You know the pair of planes for this edge, so you can find the new edge in time (9+26*15)=737, where you have 26 instead of 52 because you can find whether you'll need Max or Min in each direction in one comparison. Once you have found the edge, see if its other end is one of the already found vertices and mark the corresponding edge at that vertex as known (about 156*2=312 operations). If not, add its other vertex to the end of the list. This gives you 1049 operations per edge.
Thus, your total is 36850+156*1049=200494 operations, which is 5 times better than my count for the $N^3$ algorithm but still a bit slower than you need.
There is a room for improvement. The most obvious one is that you may want to keep the list of already found vertices for each pair of planes (the update costs 3 operations and reduces 312 to just 6. Thus, you are down to 36850+156*153=154318, which is within the limits, but just barely. Another thing is that you can try to reduce the time you need to find the first edge. but I do not see how to do it at the moment.
All right, first thing first. The main part of the cure is to construct the admissibility table that, for each pair of directions, lists all directions that may form a vertex with the given two. Surprisingly, it is very short. I'll keep the directions in the 0,$pm 1$ format like you (not normalized). I assume that you keep an array of these directions p of length 26. The only thing I want you to make sure about the order is that p[2k] and p[2k+1] are opposite for each k=0,1,...,12. This will spare a few operations in the edge tracing algorithm.
First, let us exclude all degenerate pairs. That is pretty easy. The pair (A,B) is degenerate iff A+B is a multiple of some other direction in the list. In practical terms, it means that all non-zero entries of $A+B$ are equal in absolute value. The first thing
is to create the table of pairs of directions and precompute some information for them.
To make it working for both algorithms, I suggest that for each pair (i,j) of indices, you create the structure Q[i,j] with the following fields
Boolean nondegeneracy; true if the pair is non-degenerate, false if it is degenerate.
You should get 288 non-degenerate pairs if the order of i,j matters (144 with i
vector normal; normal=$p[i]times p[j]$ is the direction orthogonal to both p[i] and p[j]
iprojector, jprojector; those are just found as $v=p[j]timesoperatorname{normal}$; iprojector$=v/(vcdot p[i])$ and similarly for jprojector. This will allow you to find a point X with given scalar products $a=p[i]cdot X$ and $b=p[j]cdot X$ as a times iprojector plus b times jprojector (6 multiplications and 3 additions)
an array goodindices of structures. Each structure element goodindices[m] will have 3 fields: integer k such that p[i],p[j],p[k] may make a vertex, and 3 scalar products iscal=iprojector$cdot$p[k],jscal=jprojector$cdot$p[k],and nscal=normal$cdot$p[k]. To construct it, you take your pair a=p[i],b=p[j] of directions and include the structure with the first field k (once you know k, you can compute the other three fields) into goodindices if the following two conditions hold:
1)the determinant det=det(a,b,c) is not 0
2) for every index m different from i,j,k at least one of the determinants det(a,b,d), det(a,d,c), and det(d,b,c) where d=p[m] has sign opposite to the sine of det, i.e., if, say, det > 0, one of them is strictly less than 0 (since all coordinates are integer, compare to -0.5). This checks that there is no direction in the cone formed by a,b,c.
Funnily enough, the total number of indices in goodindices is never greater than 6.
Also, let Max[i] be the maximal scalar product of your object points with p[i]
(I prefer to keep all 26 directions and Max array to keeping 13 and Min and Max arrays just to avoid any casework anywhere)
Now, when the table is ready, it is very easy to run the N^3 algorithm. For each pair i,j with i < j do the following.
If Q[i,j].nondegeneracy=false, just skip the pair altogether
Determine tmin as the maximum over all structures S in Q[i,j].goodindices such that S.nscal<0 of the ratios (M[S.k]-M[i]S.iscal-M[j]S.jscal)/S.nscal .
Determine tmax as the minimum over all structures S in Q[i,j].goodindices such that S.nscal>0 of the ratios (M[S.k]-M[i]S.iscal-M[j]S.jscal)/S.nscal.
If tmin < tmax, put X=M[i]Q[i,j].iprojector+M[j]Q[i,j].jprojector.; compute the points X+tmin Q[i,j].normal and X+tmax Q[i,j].normal and draw the edge between them (or store it for later drawing).
You should optimize here updating tmin and tmax when looking at each k and stopping immediately if tmax gets less than or equal to tmin. This cycle over k is the most time-consuming part when defining the edge, so how you write it determines the real speed of this algorithm. Optimize as much as possible here to make sure that you do not do any extraction from array twice, etc. Each little extra operation counts and memory is not a problem. That's why I suggested to prestore the scalar products as well, though it might seem a bit silly when we are talking of cycle of length 6 within a cycle of length 144
That's all. It should run fast enough to be acceptable already and it is robust in the sense that you do not need to perturb your maxes, check for paralellness anywhere, etc., etc.
The other algorithm is still about 4 times faster but I should check that it has no stability issues before posting it. Try this first and tell me the running time.