Excel VBA to find and mask PAN data using regex for PCI DSS compliance -


because of tools discover credit card data in file systems no more list suspicious files, tools needed mask data in files must retained.

for excel files, loads of credit card data may exist, figure macro detects credit card data in selected column/row using regex , replaces middle 6-8 digits xs useful many. sadly, i'm not guru in regex macro space.

the below works regex 3 card brands only, , works if pan in cell other data (e.g. comments fields)

the below code works, improved. improve regex make work more/all card brands , reduce false-positives including luhn algorithm check.

improvements/problems remaining :

  • match card brand's pans expanded regex
  • include luhn algorithm checking (fixed - idea ron)
  • improve while logic (fixed stribizhev)
  • even better handling of cells don't contain pans (fixed)

here's have far seems working ok amex, visa , mastercard:

sub pci_mask_card_numbers() ' written mask credit card numbers in excel files in accordance pci dss. ' highlight credit card data in excel sheet, run macro.  dim strpattern string: strpattern = "([4][0-9]{3})([^a-za-z0-9_]?[0-9]{4})([^a-za-z0-9_]?[0-9]{4})([^a-za-z0-9_]?[0-9]{4})|" & _ "([5][0-9]{3})([^a-za-z0-9_]?[0-9]{4})([^a-za-z0-9_]?[0-9]{4})([^a-za-z0-9_]?[0-9]{4})|" & _ "([3][0-9]{2})([^a-za-z0-9_]?[0-9]{4})([^a-za-z0-9_]?[0-9]{4})([^a-za-z0-9_]?[0-9]{4})|" & _ "([3][0-9]{3})([^a-za-z0-9_]?[0-9]{3})([^a-za-z0-9_]?[0-9]{4})([^a-za-z0-9_]?[0-9]{4})|" & _ "([3][0-9]{3})([^a-za-z0-9_]?[0-9]{4})([^a-za-z0-9_]?[0-9]{3})([^a-za-z0-9_]?[0-9]{4})|" & _ "([3][0-9]{3})([^a-za-z0-9_]?[0-9]{4})([^a-za-z0-9_]?[0-9]{4})([^a-za-z0-9_]?[0-9]{3})|" & _ "([3][0-9]{3})([^a-za-z0-9_]?[0-9]{6})([^a-za-z0-9_]?[0-9]{5})"  ' regex patterns pans above broken multiple parts (between brackets) ' such when regex matches first part of pan fit 1 of rmatch(k).submatches(#) # 0, 4, 8, 12, 16, 20 or 24.  ' visa start 4 , 16 digits long. typically data entry pattern 4 groups of 4 digits ' mastercard start 5 , 16 digits long. typically data entry pattern 4 groups of 4 digits ' amex start 3 , 15 digits long. typically pattern 4-6-5, data entry seems inconsistent      dim strreplace string: strreplace = "" '     dim regex new regexp  ' if line used instead of next 2, ms vbs regex v5.5 needs enabled manually. next 2 lines seem within script     dim regex object     set regex = createobject("vbscript.regexp")     dim regex new regexp     dim strinput string     dim myrange range     dim newpan string     dim aproblem string     dim masked long     dim problems long     dim total long  regex     .global = true     .multiline = true     .ignorecase = false     .pattern = strpattern ' sets regex pattern match pattern above end  set myrange = selection      msgbox ("the macro start masking credit card numbers identified in selected cells only. if entire columns selected, each column take 10-30 seconds complete. ditto rows.")  each cell in myrange     total = total + 1      ' check cell candidate holding pan, not long number     if strpattern <> "" _     , cell.hasformula = false _     , left(cell.numberformat, 1) <> "$" _     , mid(cell.numberformat, 3, 1) <> "$" '        cell.numberformat = "@"         strinput = cell.value          ' depending on data matching regex pattern, fix         if regex.test(strinput)             set rmatch = regex.execute(strinput)             k = 0 rmatch.count - 1                 toreplace = rmatch(k).value          ' if regex matched, replace pan based on regex segment                 select case 2                     case < len(rmatch(k).submatches(0))                         strreplace = rmatch(k).submatches(0) & "xxxxxxxx" & trim(rmatch(k).submatches(3))                         masked = masked + 1                     case < len(rmatch(k).submatches(4))                         strreplace = rmatch(k).submatches(4) & "xxxxxxxx" & trim(rmatch(k).submatches(7))                         masked = masked + 1                     case < len(rmatch(k).submatches(8))                         strreplace = rmatch(k).submatches(8) & "xxxxxxxx" & trim(rmatch(k).submatches(11))                         masked = masked + 1                     case < len(rmatch(k).submatches(12))                         strreplace = rmatch(k).submatches(12) & "xxxxxxxx" & trim(rmatch(k).submatches(13))                         masked = masked + 1                     case < len(rmatch(k).submatches(16))                         strreplace = rmatch(k).submatches(16) & "xxxxxxxx" & trim(rmatch(k).submatches(19))                         masked = masked + 1                     case < len(rmatch(k).submatches(20))                         strreplace = rmatch(k).submatches(20) & "xxxxxxxx" & trim(rmatch(k).submatches(23))                         masked = masked + 1                     case < len(rmatch(k).submatches(24))                         strreplace = rmatch(k).submatches(24) & "xxxxxxxx" & trim(rmatch(k).submatches(26))                         masked = masked + 1                     case else                         aproblem = cell.value                         problems = problems + 1                         ' msgbox (aproblem) ' needed when curios                 end select                 if cell.value <> aproblem                     cell.value = replace(strinput, toreplace, strreplace)                 end if              next k         else             ' adds cell value variable allow macro move past cell             ' once macro trusted not loop forever, message box can removed             ' msgbox ("problem. regex fail? bad data = " & aproblem)         end if     end if next cell ' done, tell user     msgbox ("cardholder data masked" & vbcr & vbcr & "total cells highlighted (including blanks) = " & total & vbcr & "cells masked = " & masked & vbcr & "possible problem cells = " & problems & vbcr & "all other cells ignored")  end sub 

back vacation. here's simple vba function test luhn algorithm. argument string of digits; result boolean.

it generates checksum digit , compares digit 1 in digit string feed it.


option explicit function luhn(snum string) boolean 'modulus 10 algorithm various numbers dim x long, long, j long  = len(snum) - 1 1 step -2     x = x + doublesumdigits(mid(snum, i, 1))     if > 1 x = x + mid(snum, - 1, 1) next  if right(snum, 1) = (x * 9) mod 10     luhn = true else     luhn = false end if end function  function doublesumdigits(l long) long     dim x long     x = l * 2     if x > 9 x = val(left(x, 1)) + val(right(x, 1))     doublesumdigits = x end function 


Comments

Popular posts from this blog

angularjs - ADAL JS Angular- WebAPI add a new role claim to the token -

node.js - Using Node without global install -

php - CakePHP HttpSockets send array of paramms -