Style: APA

#### Cited page

Citations are available only to our active members. Sign up now to cite pages or passages in MLA, APA and Chicago citation styles.

Display options
Reset

# A Classroom Project on Protecting Social Security Numbers from Identity Theft

By: Zhu, Katherine; Yang, Yajun | Mathematics and Computer Education, Fall 2009 | Article details

# A Classroom Project on Protecting Social Security Numbers from Identity Theft

Zhu, Katherine, Yang, Yajun, Mathematics and Computer Education

(ProQuest: ... denotes formulae omitted.)

INTRODUCTION

According to security experts, identity theft is one of the fastest growing crimes in America. Social Security Numbers (SSNs) can be used [5] to help assume the identity of other individuals and commit fraud. With an estimated 10 million individuals being victimized by identity theft each year, preventing SSNs from being stolen has become increasingly essential to help protect individuals [2].

The following presentation can work well in the classroom (including a project for a Discrete Matheamtics course) to illustrate an interesting method to protect personal information and prevent identity theft. Students can encrypt SSNs in order to generate other unique numbers that can be used for non-social security purposes, such as student identification cards, health insurance accounts, and medical records. This method can be computer implemented and will: encrypt an SSN into an Identification Number (ID), and then recover the SSN from the ID number. Therefore, there would be no need to store the SSN in the computer system. To illustrate the basic idea of this paper, we will use Student IDs (SIDs) as an example of a potential application of this method. The same basic method can also be applied to other uses that involve identification numbers.

ENCRYPTING THE SSN

In cryptography, many encryption methods already exist. For our research, we will specifically look at ciphers, which are general systems for hiding the meaning of a message by replacing each letter or number in the original message with another. Some of the most well-known ciphers are the Caesar cipher, Vigenère cipher, affine ciphers, polyalphabetic ciphers, substitution ciphers, permutation ciphers, and block ciphers [1, 3, 4, and 7]. We will use block ciphers to encrypt SSNs into student IDs (SIDs). As the name implies, the block cipher is a cipher that encrypts a string or block of numbers into another block of numbers. We will use matrix multiplication to convert SSNs to SIDs block by block. This cipher is not only simple to apply, but is also relatively secure.

To encrypt an SSN using a block cipher, we first break the 9-digit SSN into 3 blocks of 3-digit numbers. For example, an SSN 123-34-4500 is grouped into (1 2 3), (3 4 4), and (5 0 0). Then each block is turned into a 3x1 matrix: ..., ... and ... We then use a 3x3 matrix A, such as ..., to encrypt each block. This matrix A is known as a key in cryptology. We apply the matrix A to each block in matrix form by matrix multiplication.

Or, equivalently, we can complete the encryption of all three blocks by performing the following computation

The above calculations assign a new number to each digit of the SSN,

By putting these new numbers together, we generate the encrypted number

1. Will two SSNs generate the same encrypted number?

2. Can the original SSN be recovered from its encrypted number?

Suppose mat there is another SSN abc-de-fghi that generates the same encrypted number using the matrix A. We have

We know that if the matrix A is invertible (in this case the matrix A is invertible), applying the inverse A~l to both sides of the equation

from the left, we obtain that

Therefore, two SSNs with the same corresponding encrypted number are identical under the condition that the matrix A is invertible. So, the answer to Question 1 is that no two SSNs will generate the same encrypted SID as long as the key is an invertible matrix.

The condition that the matrix A is invertible is also essential to the answer of the second question. Recall that the encryption is described by the equation ... we multiply both sides of the equation by A^sup -1^ from the left and obtain ... Thus, we can recover the SSN 123-34-4500 under the condition that A is invertible. So, under the same condition, i.e., A is invertible, the original SSN can be recovered from its encrypted number.

Still, there is a problem with the encrypted number. Notice that the matrix A changes each digit of the SSN into a new number that may not always be a single digit number. This will cause the length of the new SIDs to vary. For our example, the encrypted number contains 16 digits; however, other SIDs may be longer or shorter. If we allow this variation, it may cause some confusion. It also adds difficulty to break a string of digits of the SID, say 2593638235510255 into nine numbers, each corresponding to a digit in the SSN when we retrieve the SSN. To maintain the same number of digits for our SIDs, we will only keep the units digit, as shown below.

The resulting number is 596-835-055 which will be used as an SID. However, can we still guarantee that two SSNs won't generate the same SID and that it is possible to recover the original SSN from the SID?

The answers to both questions are yes, provided that certain conditions are met. To search for the necessary and sufficient conditions, we first need to understand mathematically what we did when we only kept the units digits. The units digit of any number is actually the remainder of the number after it is divided by 10. Mathematicians use the modular function 25modl0 = 5 to describe that 5 is the remainder of 25 after dividing it by 10. In order to continue, we need to gain some basic understanding of modular arithmetic. First, when an integer is divided by 10, the remainder must be one of the ten integers: 0, 1, 2, ..., 9. Collectively they form a set 0, 1, 2, ..., 9} known as the set of integers mod 10. Secondly, the modular function can be viewed as the relationship between an integer and a number in the set of integers mod 10 such that their difference is a multiple of 10. For instance, we see in the example 25modl0 = 5 that 25-5 = 2 10. Furthermore, on the set 0, 1, 2, ..., 9}, the modular function gives a new useful tool for defining a new addition T and multiplication ® . Define T and ® on 0, 1,2, ..., 9} by

a b = (a + b) mod10

a b = (a x b) mod10

called addition modulo 10, and multiplication modulo 10, respectively.

The addition and multiplication tables are given below.

Just as the inverse of matrices is important in matrix algebra, the multiplicative inverse is equally important in modular arithmetic. For a given b in 0, 1, 2, ..., 9}, if there is another number c in {0, 1, 2, ..., 9} such that b c = 1 , then c is called an inverse of b modulo 10 and denoted by b^sup -1^ mod 10. From the above multiplication ® table, we observe that

1^sup -1^ mod10 = I, 3^sup -1^mod10 = 7, 7^sup -1^mod10 = 3, 9^sup -1^mod10 = 9 and all other numbers 0, 2, 4, 5, 6, and 8 are not invertible. Combining with the following fact,

...

we observe that an integer a in {0, 1, 2, ..., 9} has an inverse modulo 10 if and only if gcd(a,10) = 1 .

For the convenience of dealing with problems that arise from the block cipher techniques, we extend the modular function for any negative integer. Let p be a negative integer. Then p mod 10 = (p + 10g) mod10 for some positive integer q such that p + 10q is a positive integer because p-(p + 10q) is a multiple of 10. We also extend the operations and to be defined on all integers, and the concept of an inverse modulo 10 of a for any integer a. We have the following facts: for any integers a and b,

1) a b = a mod 10 + b mod10

2) a b = a mod10 × b mod10

3.) a has an inverse modulo 10 if and only if gcd(a,10) = 1.

Moreover, a^sup -1^ mod10 = (a mod 10)^sup -1^ mod 10.

We redefine matrix multiplication as matrix multiplication modulo 10, ... by A ... mod10, where the resulting matrix is the one with each entry mod 10. Alternatively, we can perform ... by calculating, A ... Therefore, the entries of the matrix A are chosen from the set of integers mod 10. The operation ® is essentially the composition of the matrix multiplication with the modular function. For ...,

....

Thus, we generate the SID 596-835-055 by using a single operation of matrix multiplication .... By a similar argument, we can decrypt the SSN only if the matrix A has a multiplicative ... inverse, denoted by A^sup -1^^sub *^, that is, ... and ..., where I is the 3 identity matrix. Does the inverse matrix A^sup -1^^sub *^ exist? Is there an easy test to tell if such an inverse A^sup -1^^sub *^ exists? And if so, how do we compute A^sup -1^^sub *^?

The following discussion shows that A^sup -1^^sub *^ exists and .... Using this, it is easy to compute the following: .... Thus, we have recovered the SSN:

Therefore, the existence of A^sup -1^^sub *^, the multiplicative ... inverse of the matrix A, guarantees that the two SSNs will not generate the same SID and that the original SSN can be recovered from the SID.

THE MATHEMATICS OF BLOCK CIPHERS

The block cipher techniques presented in this paper are centered at the concept of the inverse of a matrix. From matrix algebra, there is one important number associated with a matrix A that can be used to determine if A is invertible. It is called the determinant of A, denoted by det(A). And if the det(A) 0 , then matrix A is invertible [6]. For ..., we obtain that det(A) = 117 . Since 117 0, matrix A is invertible and A^sup -1^ exists. And the inverse is ....

How does the matrix multiplication modulo 10 ... change the condition for the existence of the inverse of a matrix, denoted by A^sup -1^^sub *^? Recall that the operation ... is the composition of the matrix multiplication with the modular function. After obtaining the inverse matrix A^sup -1^, we rewrite

then apply the modular function to A^sup -1^ as follows. First, we will find the inverse of 117 mod 10, or, in general, the inverse of det(^4) modulo 10, that is, (det(A))^sup -1^ mod 10, if (det(A))^sup -1^ mod 10 exists. We already have the result that (det(A))^sup -1^ mod 10 exists if and only if gcd(det(A),10) = 1 . In our case, gcd(det(/l),10) = gcd(117,10) = 1 , so (det(A))^sup -1^ mod 10 does exist. Using the facts that 117modl0 = 7 and 117^sup -1^ mod 10 = (117 mod 10)^sup -1^ mod 10 , we have (117)^sup -1^ mod 10 = 3 . Then we will compute

A^sup -1^ mod 10 = (117)^sup -1^ ...

Let's check our result:

Therefore, A^sup -1^^sub *^ = A^sup -1^ mod10 . These observations lead to the fact that A^sup -1^^sub *^ exists if and only if det(A) has an inverse modulo 10, in other words, if and only if (det(A))^sup -1^ mod 10 exists. We summarize the result in following proposition:

Proposition: A matrix A has an inverse modulo 10, A^sup -1^^sub *^ if and only gcd(det(A),10) = l. Moreover, A^sup -1^^sub *^ = A^sup -1^ mod 10, A^sup -1^ is the inverse of A.

The block cipher using matrix multiplication modulo 10 ... allows us to encrypt a 9-digit SSN to a 9-digit SID. If the matrix served as a is invertible under matrix multiplication ..., then the same key can be to retrieve the SSN from the SID.

COMPUTER IMPLEMENTATION DEMONSTRATION

We will develop a computer program to implement the cipher for demonstration purpose. It is easy to generate SIDs from using Microsoft Excel once we select the matrix A, the key of encryption method that satisfies the condition of the Proposition. Figure 1 shows a simple data-entry screen of Excel file: Generating SIDs. xls, automatically generates an SID displayed in cell E8 after the user enters SSN into cell E4. The part of spreadsheet to the right of column G Figure 1, which contains the matrix A and formulas used to calculate the nine-digit SID in cells K8:S8, will be protected and hidden. This Excel file only uses Excel functions and formulas to implement the SID generating process.

However, we do not expect that every school system to use the same matrix A to generate SIDs. As a matter of fact, each school should have its own unique key and keep it in a safe place. To demonstrate how a school can obtain its own key, we provide an Excel Visual Basic Application (VBA) program to search for the key of the block cipher. This program named findMatrix will randomly generate a series 3 by 3 matrices and output the first one that is mod 10 invertible along with its inverse mod 10 matrix.

Sub findMatrix()

Dim m(3, 3) As Integer, mlnv(3, 3) As Integer

Dim i As Integer, j As Integer

Dim det As Single, detMod As Integer, detlnv As Integer

Dim iter As Integer

Dim m Find As Boolean

Dim ser As String

iter = 1

m Find = False

Do While iter < 1000 And m Find = False

' Randomly generate a 3 by 3 integer matrix

For i = 1 To 3

For j = 1 To 3

m(i, j) = WorksheetFunction.Round(Rnd() * 9, 0)

Next j

Next i

' Calculate the determinant det

det = m(1, 1 ) * (m(2, 2) * m(3, 3) - m(2, 3) * m(3, 2))

det = det - m(1 , 2) * (m(2, 1 ) * m(3, 3) - m(2, 3) * m(3, 1 ))

det = det + m(1 , 3) * (m(2, 1 ) * m(3, 2) - m(2, 2) * m(3, 1 ))

' Make det positive

Do While det < 0

det = det + 10

Loop

' Check if matrix is mod 10 invertible

If det Mod 10 <> 0 Then

If det/2-lnt(det/2)<>0Then

If det / 5 - lnt(det / 5) <> 0 Then

mFind = True

End If

End If

End If

' Adding iteration counter by 1

iter = iter + 1

' End DO loop

Loop

If mFind = True Then

iter = iter - 1

' Define a string (ser) that will be used late

Select Case iter

Case Is = 1

ser = "st"

Case Is = 2

ser = "nd"

Case Is = 3

ser = "rd"

Case Is > 3

ser = "th"

End Select

With Sheets("Output").Range("A1 ")

.Offset(0, O) = "Matrix Found at " & iter & ser & " random try"

' Report the matrix

For i = 1 To 3

For j = 1 To 3

.Offset(i, j) = m(i, j)

Next j

Next i

detMod=detMod10

Select Case detMod

Case Is = 1

detlnv = 1

Case Is = 3

detlnv = 7

Case Is = 7

detlnv = 3

Case Is = 9

detlnv = 9

End Select

' Start calculating the inverse matrix

mlnv(1, 1) = detlnv * (m(2, 2) * m(3, 3) * * m(3, 2) * m(2, 3))

mlnv(2, 1) = -detlnv * (m(2, 1) * m(3, 3) -m(3,1)*m(2,3))

mlnv(3, 1 ) = detlnv * (m(2, 1 ) * m(3, 2) - m(3,1)*m(2,2))

mlnv(1 , 2) = -detlnv * (m(1 , 2) * m(3, 3) -m(3,2)*m(1,3))

mlnv(2, 2) = detlnv * (m(1, 1) * m(3, 3) - *m(3,1)*m(1,3))

mlnv(3, 2) = -detlnv * (m(1 , 1 ) * m(3, 2) -m(3, 1)*m(1,2))

mlnv(1 , 3) = detlnv * (m(1 , 2) * m(2, 3) - m(2,2)*m(1,3))

mlnv(2, 3) = -detlnv * (m(1, 1) * m(2, 3) -m(2,1)*m(1,3))

mlnv(3, 3) = detlnv * (m(1 , 1 ) * m(2, 2) - m(2, 1)*m(1,2))

For i = 1 To 3

For j = 1 To 3

Do While mlnv(i,j)<0

mlnv(i,j) = mlnv(i, j) + 10

Loop

mlnv(i, j) = mlnv(i, j) Mod 10

Next j

Next i

.Offset(12, 0) = 'The inverse matrix with mod 10 is:"

' Report the inverse matrix

For i = 1 To 3

For j = 1 To 3

.Offset(i + 12,j) = mlnv(i,j)

Next j

Next i

End With

Else

End If

End Sub

For each school, the VBA search program findMatrix should be executed only once to randomly generate a key and then send the resulting matrix to the Excel file: Generating SIDs. xls. At the same time, the corresponding inverse matrix will be computed by the search program and will be saved for decrypting the SSNs from SIDs. The above Excel spreadsheet and VBA matrix search program serve as a demonstration on how easily our block cipher method can be implemented in a computer system.

IMPLICATIONS

Our method provides a simple way to generate SIDs from SSNs. Retrieving the SSN from the SID is effortless with the knowledge of the key. Therefore, we no longer need to physically store the SSNs. This practice protects the SSNs from accidental loss and theft, and greatly lessens the risk of identity theft. In fact, we have achieved one of the two goals of a cryptologist, which is to provide an easy and inexpensive method to encrypt and decrypt messages for authorized users.

The other goal is to make sure that an unauthorized user's task of determining the original message is difficult and expensive. We have the following features of the method in terms of its security:

1. We note that the same digit in the SSN does not necessarily appear as the same digit in the SID because of due process of matrix multiplication. For example, the number 3 in the SSN is enciphered differently, first 6 and then 8. Similarly, the repeated digits 5 in the SID represent a different digit in the SSN, first 1 and then 4 and 0, with repeated digits 0.

2. The number of keys: We shall have many 3x3 matrices with entries from 0, 1, 2, .... 9} that satisfy the conditions presented in the Proposition. Each matrix is a key that can be used to encrypt SSNs and decrypt SIDs. By a search program using VBA programming, we estimate that there are about a quarter of one billion of such matrices.

We may strengthen this cipher by applying a different matrix to each of the three blocks. In fact, it is almost necessary to do so for the following reason. If a student guesses correctly the encryption scheme, but does not know the key, he/she may use his/her SSN abc-de-fghi and SID Imn-opqrst to discover the key. (Realize that this is a BIG if.) He/She can set up a matrix ... based on his/her SSN and a matrix matrix ... based on his/her SID. If gcd(det(X),10) = 1 , then he/she will be able to discover the key, matrix A, by using the equation A = Y X^sup -1^^sub *^ . This is true because the block cipher can convert the three blocks of the SSN all at once by calculating A X to get Y, as seen in the equation A X = Y , so A can be calculated by Y X^sup -1^^sub *^ . For instance, if John Smith has an SSN 380-52-1094 and an SID 451-964-785, he can set up the matrix ... for his SSN and the matrix ... for his SID. Since the det(X) = -163 and gcd(-163,10) = l, the multiplicative inverse, X^sup -1^^sub *^, exists. By using the method described earlier in the paper, he can compute that ... and compute Y X^sup -1^^sub *^ to figure out the key, as shown below. Y X^sup -1^^sub *^ =

In practice, block ciphers are secure only when the block size is relatively large, as explained above. For our study, we could use the entire SSN as one block and turn it into a 9x1 matrix. Consequently, we would choose any 9x9 modular invertible matrix to serve as a key. The argument presented in this paper will be sufficient to demonstrate that the method still works. A 9x9 matrix has 81 entries and makes it hard to be recovered. The amount of calculations involved in encryption would make it expensive to break the cipher.

As suggested earlier, we may use three 3x3 modular invertible matrices B, C, and D to encrypt an SSN abc-de-fghi, one for each block of three digits. It is equivalent to use the 9x9 matrix A consisting of B, C, and D on its diagonal as follows: ... where 0 is a 3x3 matrix with all entries 0. Thus, it is a special case of using any 9x9 invertible matrix. In this case, there are 27 entries that may take any number from the set 0, 1, 2, ..., 9}, three times more than the original method. It can be considered as an intermediate step towards a more secure encryption algorithm.

FUTURE STUDY

Although this method is relatively secure, it still can be broken by sophisticated "code breakers". There are many variations that can be done to strengthen this cipher, such as breaking the SSN in different ways, combining the block cipher with other methods, like the Caesar Cipher or permutation ciphers.

In this paper, we have only analyzed the cryptography part of this method. The reader may also want to study its cryptanalysis.

[Reference]

REFERENCES

1. Thomas P. Dence and Steven Heath, "Using Pi in Cryptology", Mathematics and Computer Education, ISSN: 0730-8639 (Winter 2005).

2. George Gaberlavage and Neal Walters, "Protecting Social Security Numbers from Identity Theft", http://www.aarp.org/research/fraudsscams/fraud/fsl22_id_theft.html (September 2005).

3. Michael Hamilton and Bill Yankosky, "The Vigenère Cipher with the TI-83", Mathematics and Computer Education, ISSN: 0730-8639 (Winter 2004).

4. Darrel W. Hardy and Carol L. Walker, Applied Algebra: Codes, Ciphers, and Discrete Algorithms, Prentice Hall, ISSN: 0-13-067464-8 (2003).

5. "Identity Theft and Your Social Security Number: What you Need to Know" https://www.trusted id.com/html/identity_theft_protection_resource_018.php, (2007).

6. David C . Lay, Linear Algebra and Its Applications, 2nd Edition, Addison Wesley, ISSN: 0-201-82478-2 (1996).

7. Simon Singh, The Code Book: The Evolution of Secrecy From Mary, Queen of Scots to Quantum Cryptography, Random House, New York, NY, ISSN: 0-385-49531-5 (1999).

[Author Affiliation]

Katherine Zhu *

Plainview - Old Bethpage John Fitzgerald Kennedy High School

50 Kennedy Drive

Plainview, New York 11803

kzhu25@gmail.com

Yajun Yang

Farmingdale State College

State University of New York

Farmingdale, New York 11735

yangy@farmingdale.edu

[Author Affiliation]

* When the research for this article began, this author was a high school junior.

• Questia's entire collection
• Automatic bibliography creation
• More helpful research tools like notes, citations, and highlights