A Sense of `Danger' for Windows Processes*
Salman Manzoor, M. Zubair Shafiq, S. Momina Tabish, Muddassar Farooq
Next Generation Intelligent Networks Research Center (nexGIN RC)
FAST National University of Computer & Emerging Sciences (NUCES)
Islamabad, 44000, Pakistan
{salman.manzoor,zubair.shafiq,momina.tabish,muddassar.farooq}@nexginrc.org
Abstract. The sophistication of modern computer malware demands run-time malware detection strategies which are not only efficient but also robust to obfuscation and evasion attempts. In this paper, we investigate the suitability of recently proposed Dendritic Cell Algorithms (DCA), both classical DCA (cDCA) and deterministic DCA (dDCA), for malware detection at run-time. We have collected API call traces of real malware and benign processes running on the Windows operating system. We evaluate the accuracy of cDCA and dDCA in classifying processes as malware or benign using API call sequences. Moreover, we also study the effects of antigen multiplier and time-windows on the detection accuracy of both algorithms.
Keywords: API Call Sequence, Artificial Immune System, Dendritic Cell Algorithm, Malware Detection
1 Introduction
Sophisticated computer malware is becoming a serious threat to the information technology infrastructure, which is the backbone of modern e-commerce systems [2]. A recent outbreak of the Conficker malware affected more than 9 million computers, including those of the Ministry of Defence, United Kingdom [3]. This incident has shown that commercial anti-virus software, even with updated malware definitions, is incapable of safeguarding our information technology infrastructure. In [7], the authors have shown that commercial anti-virus software is easily fooled by evasion attempts such as code obfuscation, encryption and polymorphic transformations. Therefore, security experts are now turning their attention to robust run-time malware detection techniques that analyze the API call sequence of a process to classify it as benign or malicious. Intuitively, such dynamic techniques are resilient to the above-mentioned evasion attempts because malware eventually has to execute its malicious activity.
Artificial Immune Systems (AIS) have served as a natural source of inspiration for developing dynamic systems for process classification. The field of AIS was initially dominated by the self/nonself theory, which models the working of the adaptive immune system. Forrest et al. initially used the idea of self/nonself to develop the negative selection algorithm (NSA) [9]. Initially, NSA was used to classify a computer process as benign or malicious. NSA has been incrementally improved and several advanced versions are now available, such as the real-valued NSA [10], the randomized real-valued NSA [11], and the real-valued NSA with variable sized detectors [17]. However, Stibor et al. carried out several experiments to evaluate the appropriateness of NSA for anomaly detection. The authors showed that the negative selection algorithm is not suitable for higher dimensional datasets [20], [21].
* Apologies to Forrest et al. [9].
The AIS research community has recently turned its attention to a new generation of immune-inspired algorithms which mimic the working model of the innate immune system [4]. The fundamental principle of such algorithms is that the innate immune system responds to `danger' instead of `nonself'. Aickelin et al. proposed a new AIS algorithm called the Dendritic Cell Algorithm (DCA) to overcome the above-mentioned shortcomings of NSA [6], [12].
The classical DCA (cDCA) involves a number of context-specific stochastic variables, which makes it difficult to analyze systematically for a given task. Consequently, Greensmith et al. [15] proposed a simplified and more predictable version of DCA called deterministic DCA (dDCA). Since its inception, two major improvements have been proposed for DCA, namely antigen multiplier and time-windows. Gu et al. initially investigated the usefulness of these concepts for DCA [16].
In this study, we investigate the relative merits and demerits of cDCA and dDCA, coupled with the antigen multiplier and time-windows concepts, for malware detection. In order to ensure real-world relevance, we have collected API call traces by running 100 benign and 416 malicious Windows executables in a virtual environment.^1 The malicious executables include trojans, viruses and worms. We quantify the efficacy of DCA in terms of its detection accuracy.
The remainder of the paper is organized as follows. Section 2 presents a summary of related work. Section 3 provides an overview of DCA and its variations. In Section 4, we explain the collection process of API call traces for real-world malware and benign processes. Section 5 describes our experimental setup and presents a detailed discussion of the empirical results. In Section 6, we briefly discuss major limitations of DCA and their potential countermeasures. Finally, we conclude the paper in Section 7.
2 Related Work
AISs have served as a natural source of inspiration for designing anomaly detection systems. To maintain focus, we only discuss the most relevant research.
Classical AIS algorithms are inspired by the working of the adaptive immune system, which follows the principles of the self/nonself theory. In this paradigm, NSA has attained the status of a de facto standard. It was proposed by Forrest et al. for classification of anomalous processes in a computer system [9]. Several advanced versions of NSA have been proposed to date, which include but are not limited to the real-valued NSA [10], the randomized real-valued NSA [11], and the real-valued NSA with variable sized detectors [17]. The advanced versions of NSA improve its scalability, space coverage, convergence time and formal treatment. Even with the above-mentioned improvements, NSA has been widely criticized for poor scalability, especially at higher dimensions [20], [21].
^1 The datasets used in this paper are available at http://www.nexginrc.org.
Danger theory, proposed by Matzinger [19], claims that the immune system works by sensing `danger'. In [6], the authors investigated the feasibility of using danger theory to develop a new paradigm of AIS algorithms for network intrusion detection. In [12], Greensmith et al. developed a novel DCA based on the concepts of danger theory. The authors successfully applied cDCA to the classification of a breast cancer dataset. In [13] and [14], the authors used cDCA for SYN scan detection.
Since the seminal work of Greensmith et al., several variations of DCA have been proposed. In [16], the authors enhanced DCA with two additional features, called antigen multiplier and time-windows. DCA relies on aggregate sampling of the antigens for eventual classification; therefore, the antigen multiplier was used to improve the sampling process. Each antigen was multiplied 10, 50 and 100 times to study the effect of multiple sampling. The authors also used time-windows to study the aggregate effect of signals. They used fixed time-windows of 2, 3, 5, 7 and 10 instances. They also compared DCA with NSA and the C4.5 decision tree as benchmarks.
In [15], the authors proposed the dDCA. Several stochastic variables of cDCA were removed to understand the merits and demerits of the core algorithm. Three relevant modifications introduced in dDCA were: (1) a simple signal processing procedure, (2) context evaluation based on one factor ($\bar{k}$), which was used to ultimately calculate an anomaly score $K_\alpha$, and (3) a new metric ($T_k$), defined to determine the threshold for $K_\alpha$. The authors evaluated the detection accuracy of dDCA using the PING scan dataset. In the next section, we provide a detailed introduction to DCA and its variations.
3 Dendritic Cell Algorithm and its variations
Dendritic cells (DCs) of the innate immune system are the core component of DCA. They have the ability to sense the internal conditions of a tissue by detecting various signals. A safe signal is produced in the event of natural cell death (apoptosis), which reflects the normal environment of a tissue. In contrast, unnatural death of cells (necrosis) because of injury or pathogenic infection leads to the release of danger signals. Another strong indicator of a potentially harmful environment is the pathogen associated molecular pattern (PAMP).
Newly born DCs are in an immature state and scour a tissue for antigens (suspects) and signals (evidence). Together, antigens and signals allow the context of a tissue to be evaluated as benign or potentially malicious. DCs distinguish between contexts by taking different pathways to maturity. A fully matured state of a DC is the result of exposure to a higher concentration of danger and PAMP signals. Likewise, a semi-matured state of a DC reflects exposure to a higher concentration of safe signals. A collective assessment of the DC population activates or suppresses the immune response. We now explain the details of the different variations of DCA.
3.1 Classical DCA (cDCA)
In cDCA, proposed by Greensmith et al. [12], a population of 100 DCs is maintained. Each DC is assigned a random migration threshold which limits the amount of time it spends in a tissue. A subset of the population is randomly sampled to form a sampling pool of antigens. The selected DCs spend time in a tissue to collect antigens and signals. The input signals are multiplied with predefined weights to calculate output signals. In this paper, for cDCA, we have used the same weight values as proposed by the authors in [12]. Three output signals ($O_0, O_1, O_2$) are calculated for each DC as $O_i = \sum_{j=0}^{2} W_{ij} S_j, \ \forall i$, where $i$ refers to the category of output signal, $j$ refers to the category of input signal, $W$ is the weight matrix, $S$ is the input signal vector and $O$ is the output signal vector. $O_0$ is the costimulatory signal (csm); a DC migrates to the lymph node if the value of csm exceeds its assigned migration threshold. In order to derive a context, a DC computes two more outputs: (1) the semi-mature context ($O_1$), and (2) the mature context ($O_2$). The values are compared with one another and the overall context is termed safe if $O_1$ is greater than $O_2$, and vice-versa.
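The signal transformation and context decision described above can be sketched as follows. This is a minimal illustration only: the weight values are placeholders and are not the weights of [12], the ordering of the input categories (PAMP, danger, safe) is an assumption, and the function names are hypothetical.

```python
# Placeholder weight matrix W. Rows: O_0 (csm), O_1 (semi-mature), O_2 (mature).
# Columns: assumed input order PAMP, danger, safe.
# These numbers are illustrative only and are NOT the weights from [12].
W = [
    [2.0, 1.0, 2.0],
    [0.0, 0.0, 3.0],
    [2.0, 1.0, -3.0],
]


def output_signals(signals, weights=W):
    """Compute O_i = sum_j W_ij * S_j for every output category i."""
    return [sum(w_ij * s_j for w_ij, s_j in zip(row, signals)) for row in weights]


def context(outputs):
    """Return 'safe' if O_1 > O_2, otherwise 'mature'."""
    return "safe" if outputs[1] > outputs[2] else "mature"


def should_migrate(cumulative_csm, migration_threshold):
    """A DC migrates once its accumulated csm exceeds its threshold."""
    return cumulative_csm > migration_threshold
```

For example, with these placeholder weights an input consisting only of safe signal, `[0.0, 0.0, 1.0]`, yields a safe context, while `[1.0, 1.0, 0.0]` yields a mature context.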
DCs that have lived their allotted span migrate to the lymph node. The antigens and their corresponding contexts are saved to a log file. Each antigen is sampled multiple times so that it can appear in different contexts in the log file. In order to detect potentially malicious antigens, they are tagged with a mature context antigen value (MCAV). The MCAV for a particular antigen $i$ ($MCAV_i$) is derived by dividing the number of times that antigen ($Ag_i$) has appeared in the danger context ($N_{di}$) by its total number of appearances ($N_i$). Mathematically, $MCAV_i = \frac{N_{di}}{N_i}$.
A threshold ($T$) is applied to the MCAV to make the final classification decision. Antigens with an MCAV higher than $T$ are termed malicious, and vice-versa. Let $\zeta_m$ be the number of anomalous instances and $\zeta$ be the total number of instances in a dataset. We can define $T = \frac{\zeta_m}{\zeta}$.
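As a rough sketch of the MCAV computation and thresholding step, assume the log is available as a sequence of (antigen, context) pairs in which the danger context is labelled "mature". The log format, the labels and the function names are assumptions made for illustration.

```python
from collections import Counter


def mcav(log):
    """Return MCAV_i = N_di / N_i for every antigen i in the log.

    `log` is an iterable of (antigen_id, context) pairs, where context is
    'mature' (danger) or 'safe'. This log format is an assumption.
    """
    total = Counter()
    danger = Counter()
    for antigen, ctx in log:
        total[antigen] += 1
        if ctx == "mature":
            danger[antigen] += 1
    return {antigen: danger[antigen] / total[antigen] for antigen in total}


def threshold(num_anomalous, num_total):
    """T = zeta_m / zeta."""
    return num_anomalous / num_total


def classify(mcav_values, t):
    """Label antigens whose MCAV is higher than T as malicious."""
    return {a: ("malicious" if v > t else "benign") for a, v in mcav_values.items()}
```

With a hypothetical log in which antigen `p1` appears three times (twice in the mature context) and `p2` four times (once in the mature context), the MCAVs are 2/3 and 1/4. A threshold of 0.5 would then label `p1` as malicious and `p2` as benign.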
3.2 Deterministic Dendritic Cell Algorithm (dDCA)
The DCA has provided promising classification accuracy on a number of benchmark datasets [12], [13]. However, the basic DCA uses several stochastic variables, which makes its systematic analysis very difficult. In order to mitigate this problem, the authors in [15] have proposed some changes to cDCA. The new variation of DCA, called dDCA, has the following enhanced features:
– Three input signal categories are reduced to two, i.e. danger and safe signals;
– The random migration threshold is replaced with a uniform distribution of lifespan values in the population;
– Dedicated storage and sampling of antigens is replaced with sampling of all antigens by DCs;
– Instead of forming a sampling pool, the signal data is processed by all DCs. As a result, output signals are calculated once for the population of DCs;
– Only one factor ($\bar{k}$) is calculated for each DC to arrive at a context. Negative values of $\bar{k}$ reflect a benign context and positive values indicate a malicious context.
Signal processing is simplified by reducing the number of input signals and using a weight assigning scheme. Two outputs are calculated: (1) the accumulation of signals (csm), and (2) a score ($\bar{k}$), to which the threshold is applied for classification. $csm$ is defined as $csm = D - S$, and $\bar{k} = D - 2S$, where $D$ and $S$ are the values of the danger and safe signals respectively. A new parameter $K_\alpha$ is defined using the values of $\bar{k}$. Its purpose is to provide real-valued scores. $K_\alpha$ is defined as $K_\alpha = \frac{\sum_m k_m}{\sum_m \alpha_m}$, where $k_m$ is the $\bar{k}$ value for $DC_m$, and $\alpha_m$ is the number of antigens of type $\alpha$ presented by $DC_m$. Moreover, a threshold parameter ($T_k$) is also defined. Values of $K_\alpha$ greater than $T_k$ depict a malicious context and smaller values indicate benign behavior.
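The dDCA quantities above can be illustrated with a short sketch. It follows the formulas exactly as stated in the text. Restricting the sums in $K_\alpha$ to DCs that presented at least one antigen of type $\alpha$ is an assumption, as are the data layout and the function names.

```python
def signal_update(danger, safe):
    """Return (csm, k) for one signal instance, as defined in the text."""
    csm = danger - safe       # csm = D - S, as stated in the text
    k = danger - 2.0 * safe   # k = D - 2S
    return csm, k


def k_alpha(presentations, antigen_type):
    """K_alpha = sum_m k_m / sum_m alpha_m.

    `presentations` is a list with one (k_m, counts) entry per DC, where
    `counts` maps antigen type to the number of antigens of that type
    presented by the DC. Only DCs that presented the given antigen type
    contribute to either sum (an assumption of this sketch).
    """
    numerator = 0.0
    denominator = 0
    for k_m, counts in presentations:
        alpha_m = counts.get(antigen_type, 0)
        if alpha_m > 0:
            numerator += k_m
            denominator += alpha_m
    if denominator == 0:
        raise ValueError("antigen type was never presented")
    return numerator / denominator
```

For instance, if two DCs with `k` values 4.0 and -1.0 each presented two antigens of type `a`, then `k_alpha` for `a` is 3.0 / 4 = 0.75, which would be compared against $T_k$.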
$T_k$ is defined as $T_k = \frac{S_k \cdot \bar{i}}{I_s}$, where $I_s$ is the total number of instances in a dataset, $\bar{i}$ is the mean number of iterations per incarnation of a DC, and $S_k = \sum_{I_s} D - 2 \sum_{I_s} S$.
3.3 Antigen Multiplier
DCA has mostly been utilized for data mining problems. Most of the datasets used for data mining contain only one copy of each instance (or antigen). In order to assess the type of an antigen, it should be presented multiple times so that an MCAV value can be generated for it. The concept of the antigen multiplier caters for this requirement [16]. Each antigen is copied multiple times in the tissue antigen vector. The classification decision is then averaged over the replicated population. Intuitively, replicating an antigen should help in improving the classification accuracy.
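A minimal sketch of the replication step is given below. The text does not specify how the copies are ordered in the tissue antigen vector, so placing the copies of each antigen next to each other is an assumption. The function name is hypothetical.

```python
def multiply_antigens(antigens, multiplier):
    """Return the tissue antigen vector with each antigen copied `multiplier` times.

    Copies of the same antigen are placed consecutively (an assumption).
    """
    if multiplier < 1:
        raise ValueError("multiplier must be at least 1")
    return [antigen for antigen in antigens for _ in range(multiplier)]
```

For example, `multiply_antigens(["p1", "p2"], 3)` produces three copies of `p1` followed by three copies of `p2`, so each antigen can be sampled in several contexts before its MCAV is computed.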
3.4 Moving Time-Windows
The signals in our body do not die suddenly; rather, they fade slowly over a period of time. This temporal effect of signals is captured by introducing the concept of moving time-windows in DCA [16]. New signals are computed using $N_{ij} = \frac{1}{w} \sum_{n=i}^{i+w} O_{nj}, \ \forall j$, where $N_{ij}$ is the new signal value of the $i$th antigen of the $j$th category, $w$ is the window size and $O_{ij}$ is the original signal of the $i$th antigen and $j$th category. New signals ($N$) are the average of old signals ($O$) in a particular time-window. Intuitively speaking, averaging of signals reduces the noise in input signals.
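The windowed averaging can be sketched as below. Two choices here are assumptions rather than details taken from the text: each window is taken to cover exactly $w$ consecutive instances starting at instance $i$, and windows that run past the end of the trace are truncated and averaged over the instances actually available. The function name is hypothetical.

```python
def window_average(signals, w):
    """Average each signal category over a forward-looking window.

    `signals` is a list of rows, one per antigen instance; each row holds
    one value per signal category. Row i of the result is the mean of rows
    i .. i + w - 1 of the input, truncated at the end of the trace.
    """
    if w < 1:
        raise ValueError("window size must be at least 1")
    averaged = []
    for i in range(len(signals)):
        window = signals[i:i + w]
        n_categories = len(signals[i])
        averaged.append(
            [sum(row[j] for row in window) / len(window) for j in range(n_categories)]
        )
    return averaged
```

With `w = 1` the signals are returned unchanged. Larger windows smooth out instance-to-instance fluctuations at the cost of blurring abrupt changes in the signals.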