Efficient Learning of Sparse Representations with an Energy-Based Model

Marc'Aurelio Ranzato    Christopher Poultney    Sumit Chopra    Yann LeCun

Courant Institute of Mathematical Sciences

New York University, New York, NY 10003

{ranzato,crispy,sumit,yann}@

Abstract

We describe a novel unsupervised method for learning sparse, overcomplete features. The model uses a linear encoder, and a linear decoder preceded by a sparsifying non-linearity that turns a code vector into a quasi-binary sparse code vector. Given an input, the optimal code minimizes the distance between the output of the decoder and the input patch while being as similar as possible to the encoder output. Learning proceeds in a two-phase EM-like fashion: (1) compute the minimum-energy code vector, (2) adjust the parameters of the encoder and decoder so as to decrease the energy. The model produces "stroke detectors" when trained on handwritten numerals, and Gabor-like filters when trained on natural image patches. Inference and learning are very fast, requiring no preprocessing, and no expensive sampling. Using the proposed unsupervised method to initialize the first layer of a convolutional network, we achieved an error rate slightly lower than the best reported result on the MNIST dataset. Finally, an extension of the method is described to learn topographical filter maps.

1 Introduction

Unsupervised learning methods are often used to produce pre-processors and feature extractors for image processing systems. Popular methods such as Wavelet decomposition, PCA, Kernel-PCA, Non-Negative Matrix Factorization [1], and ICA produce compact representations with somewhat uncorrelated (or independent) components [2]. Most methods produce representations that either preserve or reduce the dimensionality of the input. However, several recent works have advocated the use of sparse-overcomplete representations for images, in which the dimension of the feature vector is larger than the dimension of the input, but only a small number of components are non-zero for any one image [3, 4]. Sparse-overcomplete representations present several potential advantages. Using high-dimensional representations increases the likelihood that image categories will be easily (possibly linearly) separable. Sparse representations can provide a simple interpretation of the input data in terms of a small number of "parts" by extracting the structure hidden in the data. Furthermore, there is considerable evidence that biological vision uses sparse representations in early visual areas [5, 6].

It seems reasonable to consider a representation "complete" if it is possible to reconstruct the input from it, because the information contained in the input would need to be preserved in the representation itself. Most unsupervised learning methods for feature extraction are based on this principle, and can be understood in terms of an encoder module followed by a decoder module. The encoder takes the input and computes a code vector, for example a sparse and overcomplete representation. The decoder takes the code vector given by the encoder and produces a reconstruction of the input. Encoder and decoder are trained in such a way that reconstructions provided by the decoder are as similar as possible to the actual input data, when these input data have the same statistics as the training samples. Methods such as Vector Quantization, PCA, auto-encoders [7], Restricted Boltzmann Machines [8], and others [9] have exactly this architecture but with different constraints on the code and learning algorithms, and different kinds of encoder and decoder architectures. In other approaches, the encoding module is missing but its role is taken by a minimization in code space which retrieves the representation [3]. Likewise, in non-causal models the decoding module is missing and sampling techniques must be used to reconstruct the input from a code [4]. In sec. 2, we describe an energy-based model which has both an encoder and a decoder. After training, the encoder allows very fast inference because finding a representation does not require solving an optimization problem. The decoder provides an easy way to reconstruct input vectors, thus allowing the trainer to assess directly whether the representation extracts most of the information from the input.

Most methods find representations by minimizing an appropriate loss function during training. In order to learn sparse representations, a sparsity term is added to the loss. This term usually penalizes those code units that are active, aiming to make the distribution of their activities highly peaked at zero with heavy tails [10][4]. A drawback for these approaches is that some action might need to be taken in order to prevent the system from always activating the same few units and collapsing all the others to zero [3].

Other approaches enforce sparsity by including a saturating non-linearity in the system [11]. This in general forces all the units to have the same degree of sparsity. In this paper, we present a system which achieves sparsity by placing a non-linearity between encoder and decoder. Sec. 2.1 describes this module, dubbed the "Sparsifying Logistic", which is a logistic function with an adaptive bias. The non-linearity is parameterized in a simple way which allows us to control the degree of sparsity of the representation as well as the entropy of each code unit.
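The module itself is defined in sec. 2.1. Purely as an illustration of the idea (a logistic whose bias adapts to a running average of each unit's own activity, so most outputs are pushed toward zero), here is a minimal sketch; the parameter names eta (controlling sparsity) and beta (controlling gain), and the specific running-average form, are assumptions for this sketch:

```python
import numpy as np

def sparsifying_logistic(z, zeta_prev, eta=0.02, beta=1.0):
    """One step of an adaptive logistic non-linearity.

    Each unit's output is its exponentiated activity divided by a
    running average of that same quantity, so outputs lie in [0, 1]
    and are quasi-binary: near zero unless the unit's current input
    is large relative to its own recent history.
    """
    num = eta * np.exp(beta * z)
    zeta = num + (1.0 - eta) * zeta_prev   # running-average denominator state
    zbar = num / zeta                      # sparse output in [0, 1]
    return zbar, zeta
```

A small eta makes the denominator dominated by history, so only units whose current activity clearly exceeds their running average produce outputs near 1.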

Unfortunately, learning the parameters in encoder and decoder cannot be achieved by simple back-propagation of the gradients of the reconstruction error: the Sparsifying Logistic is highly non-linear and resets to zero most of the gradients coming from the decoder. Therefore, in sec. 3 we propose to augment the loss function by considering not only the parameters of the system but also the code vectors as variables over which the optimization is performed. Exploiting the fact that 1) it is fairly easy to determine the weights in encoder and decoder when "good" codes are given, and 2) it is straightforward to compute the optimal codes when the parameters in encoder and decoder are fixed, we describe a simple iterative coordinate descent optimization to learn the parameters of the system. The procedure can be seen as a sort of deterministic version of the EM algorithm in which the code vectors play the role of hidden variables. The learning algorithm described turns out to be particularly simple, fast and robust. No pre-processing is required for the input images, beyond a simple centering and scaling of the data. In sec. 4 we report experiments of feature extraction on handwritten numerals and natural image patches. When the system has a linear encoder and decoder (remember that the Sparsifying Logistic is a separate module), the filters resemble "object parts" for the numerals, and localized, oriented features for the natural image patches. Applying these features for the classification of the digits in the MNIST dataset, we have achieved by a small margin the best accuracy ever reported in the literature. We conclude by showing a hierarchical extension which suggests the form of simple and complex cell receptive fields, and leads to a topographic layout of the filters which is reminiscent of the topographic maps found in area V1 of the visual cortex.
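The two-phase coordinate descent described above can be sketched as follows. This is a simplified illustration, not the paper's implementation: squared-error energies, plain gradient steps with arbitrary step sizes, and an ordinary logistic standing in for the Sparsifying Logistic; dimensions and iteration counts are chosen only for readability.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def energy(X, Z, Wc, Wd):
    s = sigmoid(Z)                               # stand-in for the Sparsifying Logistic
    Ec = 0.5 * np.sum((Z - Wc @ X) ** 2)         # code prediction energy
    Ed = 0.5 * np.sum((X - Wd @ s) ** 2)         # reconstruction energy
    return Ec + Ed

def train_step(X, Wc, Wd, code_iters=20, lr_z=0.1, lr_w=0.01):
    # Phase 1: find a low-energy code for this patch, parameters fixed.
    Z = Wc @ X                                   # initialize from the encoder output
    for _ in range(code_iters):
        s = sigmoid(Z)
        grad_z = (Z - Wc @ X) + s * (1 - s) * (Wd.T @ (Wd @ s - X))
        Z -= lr_z * grad_z
    # Phase 2: one gradient step on encoder and decoder, code fixed.
    s = sigmoid(Z)
    Wc += lr_w * np.outer(Z - Wc @ X, X)         # pull encoder output toward Z
    Wd -= lr_w * np.outer(Wd @ s - X, s)         # pull reconstruction toward X
    return Z, Wc, Wd
```

Alternating these two phases over training patches decreases the total energy, mirroring the deterministic EM view in which the codes play the role of hidden variables.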

2 The Model

The proposed model is based on three main components, as shown in fig. 1:

• The encoder: A set of feed-forward filters parameterized by the rows of matrix W_C, that computes a code vector from an image patch X.

• The Sparsifying Logistic: A non-linear module that transforms the code vector Z into a sparse code vector Z̄ with components in the range [0, 1].

• The decoder: A set of reverse filters parameterized by the columns of matrix W_D, that computes a reconstruction of the input image patch from the sparse code vector Z̄.

The energy of the system is the sum of two terms:

E(X, Z, W_C, W_D) = E_C(X, Z, W_C) + E_D(X, Z, W_D)    (1)

The first term is the code prediction energy which measures the discrepancy between the output of the encoder and the code vector Z. In our experiments, it is defined as:

E_C(X, Z, W_C) = (1/2) ||Z − Enc(X, W_C)||² = (1/2) ||Z − W_C X||²    (2)

The second term is the reconstruction energy which measures the discrepancy between the reconstructed image patch produced by the decoder and the input image patch X.

