Author:
Katirae, Narges
Title:
Plagiarism Detection in Text
Degree level:
M.Sc.
Place of study:
Isfahan: Isfahan University of Technology, Department of Electrical and Computer Engineering
Pagination:
x, 104 p.: illustrations, tables, charts
Note:
Title page in Persian and English
Supervisor:
Mohammad Ali Montazeri
Descriptors:
Single-label text classification, neural network, verbatim cheating, cheating with low ambiguity
Indexing date:
10/9/92 (Solar Hijri calendar)
Referees:
Abdolreza Mirzaei, Nasser Ghadiri
Faculty:
Electrical and Computer Engineering
Persian abstract:
In Persian and English: viewable in the digital version
English abstract:
Plagiarism Detection in Text
Narges Katirae, n.katirae@ec.iut.ac.ir
Date of Submission: 2013/09/21
Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, Iran
Degree: M.Sc.
Language: Farsi
Supervisor: Mohammad A. Montazeri, montazeri@cc.iut.ac.ir

Abstract

With the rapid growth of electronic documents and expanding internet access, ideas, articles, and other technical documents have become much easier to obtain. This has led to rapid exchange of information and also to a sharp increase in plagiarism. Because scientific resources are readily accessible in universities and other educational institutions, some users resort to plagiarism. Given the importance of this issue, and of copyright law in protecting writers and scholars from fraudulent use of their research results, plagiarism has been studied extensively in recent years. Researchers have developed methods for detecting plagiarism in texts such as dissertations, papers, and other scientific reports, and continue to work on improving the accuracy and efficiency of these methods.

Plagiarism detection raises two main problems. The first is the number of texts to be compared, which can exceed several thousand. The first step is therefore to find the source texts most likely to have been copied from. One way to reduce the number of comparisons is to classify the texts, since people naturally draw on texts related to their own subject. The second problem is finding the exact location of the copied passages.

This thesis uses a two-phase approach, each phase consisting of several steps. In the first phase, a neural network classifier is proposed for single-label text classification. The classifier uses competitive rules, error correction, and a geometric sequence to adjust the weights between words and topics. With this classifier, each suspicious text is compared only with the texts of its own class, which reduces the number of comparisons between suspicious and original texts. In the second phase, after preprocessing, the source texts of that class that are most similar to the suspicious text are selected, since they are the most likely sources. Next, a formula is presented that computes the percentage of similarity between two sentences from the keywords and phrases shared by a sentence of the source text and a sentence of the suspicious text. In the third step, a threshold and runs of consecutive similar sentences are used to locate the plagiarized passages in the two texts. Finally, because the detected passages may contain errors and may be fragmented, a three-step post-processing algorithm is applied to them.

The proposed method targets verbatim plagiarism and plagiarism with low ambiguity (light obfuscation). It is compared with related work in two phases. In the first phase, the proposed classifier is considerably more accurate than the nearest-neighbor method. When the number of topics is large, it also gives much better results than the Naïve Bayes method, although it performs worse than the Support Vector Machine. In the second phase, the proposed detection algorithm is compared with the top four entries in the PAN10 plagiarism detection competition: it performs worse than the first-placed entry and better than the other three.

Keywords: Classification of single-label text, neural networks, plagiarism, verbatim cheating, cheating with low ambiguity
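
The second phase described in the abstract (restrict comparison to same-class sources, score sentence pairs, apply a threshold, then join neighboring matches) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the similarity formula, so Jaccard overlap of word sets is used as a stand-in, and the function names, the `label` field, the threshold of 0.6, and the gap of one sentence are all assumptions made for illustration. The neural classifier is not reproduced; its predicted label is taken as an input.

```python
import re

def tokens(sentence):
    """Lowercased word set of a sentence (illustrative tokenizer)."""
    return set(re.findall(r"\w+", sentence.lower()))

def similarity(a, b):
    """Jaccard overlap of word sets; a stand-in, not the thesis's formula."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def candidate_sources(suspicious_label, corpus):
    """Effect of phase 1: keep only sources sharing the predicted class."""
    return [doc for doc in corpus if doc["label"] == suspicious_label]

def flag_sentences(suspicious, source, threshold=0.6):
    """Indices of suspicious sentences whose best match meets the threshold."""
    return [i for i, s in enumerate(suspicious)
            if any(similarity(s, t) >= threshold for t in source)]

def merge_runs(indices, max_gap=1):
    """Join flagged indices into (start, end) passages, bridging small gaps."""
    runs = []
    for i in indices:
        if runs and i - runs[-1][1] <= max_gap + 1:
            runs[-1][1] = i          # extend the current passage
        else:
            runs.append([i, i])      # start a new passage
    return [tuple(r) for r in runs]
```

Bridging small gaps in `merge_runs` is one simple way to address the fragmented passages the abstract mentions; the thesis's own three-step post-processing is not specified in this record.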