Tuesday, 22 September 2009

matrices - Matrix Factorization Model

I'll give an example from politics. Let's say you have a legislative body, such as the House of Representatives in the U. S. Congress. Over a period of time, the members of the House will vote on many bills and thereby accrue a voting history. Let's encode votes as numbers: a vote for a bill is 1, a vote against a bill is -1, and an abstention (no vote) is 0. Also, let's label the representatives R1,ldots,Rm, and label the bills B1,ldots,Bn. We thus have, for each pair i,j of numbers with 0<ileqm, 0<jleqn, a vote Vijin1,0,1, namely how congressperson Ri voted on bill Bj. This gives us a big mtimesn "vote matrix" V whose entries are the votes Vij.



Now, it's true that each congressperson has an opinion on every bill, or dually, that each bill appeals to different congresspeople in different amounts (possibly negative). However, that description misses an important aspect of the situation: a congressperson's voting behavior can be approximated using much fewer parameters than a list of all his votes. Also, a bill's tendency to appeal to different people can be approximated using much fewer parameters than a list of all the people who voted for and against it. Indeed, it's not really bills that congresspeople have opinions on; it's issues and policies. On the flip side, a given bill will implement various types of policies and address various issues, and that's really what determines who will like it and how much.



Thus, to describe voting behaviors, what you really need is (1) a list of policies P1,ldots,Pf (the latent factors), (2) for each congressperson Ri, a degree of preference Sik for each policy Pk, and (3) for each bill Bj, a value Ckj describing to what extent it implements policy Pk. Let's continue the convention that positive values for Sik or Ckj indicate accordance and negative values indicate opposition. To each congressperson Ri, we can assign the vector Si=(Si1,ldots,Sif) (which we might call the "policy vector" for that congressperson), and to each bill Bj, we can assign the vector Cj=(C1j,ldots,Cfj) (which we might also call the "policy vector" for that bill).



For what follows, I'm going to use a very simplistic (but somewhat plausible) mathematical model. (Your article uses a more complicated and more realistic model, taking bias into account, for example.) Also, the model makes a lot more sense if congresspeople are asked to state their degree of preference for each bill rather than simply voting "yes" or "no," so that the "votes" Vik take values in mathbbR. When deciding how to vote on a bill, a congressperson may consider how well it correlates with her opinions and make a decision based on that. With a lot of vigorous hand waving and wishful thinking, the outcome of this process can be described very simply in terms of policy vectors: the vote Vij is simply the dot product SicdotCj=sumkSikCkj. (I'll leave it as an exercise to show that's not completely ridiculous, even if unlikely to be exactly true.) Another way of saying this is that if S is the matrix with entries Sik and C is the matrix with entries Ckj, then V=SC. In other words, our knowledge about legislative policies as latent factors induces a factorization of the matrix V.



One merit of the above approach is that although it's a bit too simple, it leads to well understood mathematics. For it to be useful, the number of policies f should be much smaller than m and n, the numbers of congresspeople and bills. In that case, the factorization V=SC means that V has rank f, which is small compared to its dimensions. In practice, admitting that the description in terms of policies as latent factors can only be a good approximation, not exact, this means that V is well approximated by a low-rank matrix. Factorizations can be obtained from that observation alone using standard matrix tools like the singular-value decomposition. In particular, the policy vectors can be found even before you have any idea what the "policies" should be. (In other words, you don't have to sit down and make a list of policies you think are important and figure out what the policy vectors must be from that; you can use a standard algorithm which will determine the policies for you. Of course, it won't name the policies, but if you need to, you can compare policy vectors to figure out what real-world policies the algorithmically extracted policies approximate.)

No comments:

Post a Comment