Project Topic

Project number, Department, Student names, Email, Supervisor names

Using a Membership Inference Attack on Models Trained with Tabular Data to Assess the Risk of Information Leakage from a Given Model

Hebrew Abstract (translated into English)

With the growing use of machine learning models across various fields, concerns about the privacy and security of the sensitive data these models process have become critically important. This study examines the use of a Membership Inference Attack (MIA) to assess the risk of information leakage from models trained on tabular data.
The goal of a Membership Inference Attack is to determine whether a particular record was part of the training set used to build a model.
In this study we focus specifically on models trained on tabular data, which are common in fields such as finance, healthcare, and customer analytics.
Two practical scenarios can motivate this study: first, proving that government data was used without authorization to train models; second, assessing the vulnerability of a government model to a Membership Inference Attack.
We propose an MIA technique that analyzes how consistent the model's predictions remain as the input changes. The method rests on the hypothesis that records from the training set tend to keep a consistent classification even when the record is modified, whereas records outside the training set are more likely to have their classification change noticeably.
The attack proceeds in several stages. First, the attacker queries the model with a set of records and receives their classifications from the model. This provides a baseline for comparison.
The attacker then perturbs the input iteratively, checks whether the model's classifications have changed, and, when the classification of a given record has changed, measures the magnitude of the perturbation. If the magnitude exceeds a predefined threshold, the attacker concludes that the record is most likely part of the training set. Conversely, if the magnitude is below the threshold, the attacker concludes that the record is likely not part of the training set.
The experimental evaluation covers a diverse set of tabular datasets and a range of machine learning models commonly used for tabular data. We measure the attack's effectiveness by quantifying the rate at which it correctly infers the membership of records.

English Abstract

With the increasing adoption of machine learning models in various domains, concerns about the privacy and security of the sensitive data processed by these models have become paramount. This research investigates the use of a membership inference attack (MIA) to assess the risk of information leakage from models trained with tabular data.
Membership inference attacks aim to determine whether a particular data point was part of the training dataset used to build a model. 
In this study, we focus specifically on models trained with tabular data, which is commonly encountered in domains such as finance, healthcare, and customer analytics. 
Two practical scenarios motivate this research: first, proving that government data was used without authorization to train a model; second, evaluating the vulnerability of a government model to a membership inference attack.
We propose a membership inference attack (MIA) technique that analyzes the consistency of the model's predictions when the input is modified. The method is based on the hypothesis that records from the training set are more likely to keep a consistent classification despite perturbations, whereas non-members are more likely to have their classifications change noticeably.
The attack proceeds in several stages. Initially, the attacker queries the model with a set of records and receives the corresponding classification labels from the model. This provides a baseline for comparison.
The attacker then perturbs the input iteratively and checks whether the model's classifications have changed. When the classification of a particular record changes, the attacker measures the magnitude of the perturbation that caused the change. If the magnitude exceeds a predefined threshold, the attacker concludes that the record is most likely part of the training set. If the magnitude is below the threshold, the attacker concludes that the record is likely not part of the training set.
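The stages above can be illustrated with a minimal sketch. It is not the implementation used in this research: it assumes purely numeric features, a label-only `predict` callable that maps a list of records to a list of labels, and a simple perturbation scheme that shifts one feature at a time by a fixed scale. The names `flip_magnitude`, `infer_membership`, and `toy_predict`, as well as the scale grid and threshold, are hypothetical and chosen only for illustration.

```python
from math import inf


def flip_magnitude(predict, record, scales):
    """Return the smallest tested scale whose perturbation changes the label.

    Assumption: each perturbation shifts a single numeric feature by +scale
    or -scale. Returns inf if no tested perturbation changes the label.
    """
    baseline = predict([record])[0]  # baseline classification for comparison
    for scale in sorted(scales):     # iterate from weak to strong perturbations
        candidates = []
        for j in range(len(record)):
            for sign in (1.0, -1.0):
                perturbed = list(record)
                perturbed[j] += sign * scale
                candidates.append(perturbed)
        if any(label != baseline for label in predict(candidates)):
            return scale
    return inf


def infer_membership(predict, records, scales, threshold):
    """Flag a record as a likely member if its flip magnitude exceeds the threshold.

    A record whose label never changes has magnitude inf and is flagged as a member.
    """
    magnitudes = [flip_magnitude(predict, r, scales) for r in records]
    decisions = [m > threshold for m in magnitudes]
    return decisions, magnitudes


def toy_predict(rows):
    """Hypothetical stand-in for the target model: thresholds the first feature."""
    return [1 if row[0] > 0 else 0 for row in rows]


records = [[2.0, 0.0], [0.1, 0.0]]
scales = [0.25, 0.5, 1.0, 2.0, 4.0]
decisions, magnitudes = infer_membership(toy_predict, records, scales, threshold=1.0)
print(decisions, magnitudes)
```

In this toy setting the first record sits far from the decision boundary and needs a larger perturbation before its label changes, so it is flagged as a likely member, while the second record flips under the weakest perturbation and is flagged as a non-member. Categorical features and the choice of threshold would need separate handling in practice.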
The experimental evaluation involves a diverse set of tabular datasets and a range of machine learning models commonly employed in tabular data analysis. We measure the effectiveness of the membership inference attack by quantifying the rate at which it correctly infers the membership of data points.
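As an illustration of the success-rate measure, the following sketch computes the fraction of records whose membership is inferred correctly. It assumes the evaluator knows the true membership of each record; the function name `attack_success_rate` and the sample values are hypothetical and are not results from this research.

```python
def attack_success_rate(predicted, actual):
    """Fraction of records whose predicted membership matches the ground truth."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one record is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(predicted)


# Made-up example: True means "member of the training set".
predicted = [True, False, True, False]
actual = [True, False, False, False]
rate = attack_success_rate(predicted, actual)
print(rate)
```

When members and non-members are not equally represented in the evaluation set, this single number can be misleading, so it may be worth reporting it alongside the proportion of members in the set.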