Is This a Wolf? Understanding Bias in Machine Learning
By Cameron Boozarjomehri
When you look at the image above, all you see is a wolf in the snow. How do you know it's a wolf in the snow? Well, I could go off on a long tangent about how your eyes receive light that your brain interprets as regions and shapes, which your mind then matches against what you expect a wolf to look like.
In short, you know it's a wolf because you know what a wolf looks like. Of course, this ability to process an image, or the real thing, isn't limited to wolves. That process of neural interaction and visual interpretation happens whenever your brain wants to identify whatever you're looking at. The only reason your brain knows how to make these connections is that you have trained it, consciously or subconsciously, to understand that pointy ears, a long snout, thick fur, four legs, and a tail usually mean "wolf."
Now look at the dog above. Again you'll notice the pointy ears, the long snout, the thick fur, the legs. All the telltale signs of a wolf, yet this isn't a wolf. It's a dog (a husky, to be specific). As a human, how do you make that distinction? How did you learn the difference between a wolf and a husky? Is it the silhouette? The color of the fur? The collar? One could argue that all of these things are just different enough that your brain doesn't think "wolf," it thinks "dog."
When it comes to machine learning, it is exactly these subtle differences the algorithm is trying to sort out: which information in the image actually matters. But as machine learning has matured, we have begun to realize just how limited our understanding is of how these tools reach their conclusions.
MITRE in the mix!
This is where MITRE comes in. Through MITRE research, we are taking steps to assess new ways for algorithms to give us feedback. Machine learning is a valuable tool, but its black box nature can make it difficult to rely on in the real world.
For example, imagine I take our algorithm and show it the new image of the Corgi above. If the algorithm gives more weight to the color of the fur, it may well decide this is a wolf. If it focuses on the silhouette of the head or the contrast of the collar, it may decide it's a dog. The problem is that I have trained the algorithm to the point where I am happy with its performance, but now I want it to give me an answer. Some machine learning tools break an image down into shapes, colors, and patterns to discern information about each of those elements. Each element is used to compute a weight, or score, and those scores are combined to produce the result, in this case wolf or dog. In the most limited case, the tool might simply tell me "dog" or "wolf." If we make it more verbose, it might also tell us how it weighted the different elements of its decision.
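To make the idea of combining weighted elements concrete, here is a minimal sketch in Python. The feature names, scores, and weights are entirely invented for illustration; they are not how any particular tool actually represents an image.

```python
# A hypothetical sketch of how a classifier might combine per-feature scores
# into a final "wolf" vs. "dog" decision. All features and weights are invented.

features = {                 # scores extracted from the image (hypothetical)
    "fur_color": 0.8,        # pale, wolf-like fur
    "head_silhouette": 0.3,  # rounded, dog-like head
    "collar_contrast": 0.1,  # a visible collar suggests a pet
}

weights = {                  # how much the trained model trusts each element
    "fur_color": 0.6,
    "head_silhouette": 0.25,
    "collar_contrast": 0.15,
}

# Weighted sum: a higher combined score pushes the decision toward "wolf".
wolf_score = sum(weights[name] * features[name] for name in features)

# The most limited output is just the label...
print("wolf" if wolf_score > 0.5 else "dog")

# ...a more verbose output also reports how each element was weighted.
for name in features:
    print(f"{name}: contribution {weights[name] * features[name]:.2f}")
```

With the fur color weighted that heavily, these made-up numbers tip the decision to "wolf"; shift the weight toward the collar and silhouette and the very same scores come out "dog."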
Now imagine we want to redesign the algorithm. Wouldn't it be better if we could get it to highlight which parts of the image influenced the result? We might discover that our algorithm says this is a dog not because of the Corgi itself, but because of the snow and the lack of trees in the background of the image. This example of a machine learning classifier focusing on unexpected features of an image comes from research aimed at better understanding how opaque, black-box models arrive at their classifications. The tool that work produced, LIME, is an example of an active research community focused on "Explainable AI." DARPA is making significant investments in this area, as are many major AI-focused companies.
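LIME is available as an open-source Python package, and the sketch below shows roughly how it can be asked to highlight the image regions that drove a prediction. The stand-in classifier and the random image are placeholders I've invented so the example is self-contained; in practice you would pass in your own model's probability function and the actual photo.

```python
# A rough sketch of asking LIME which regions of an image drove a prediction.
# The classifier and image below are stand-ins so the example runs on its own.
import numpy as np
from lime import lime_image
from skimage.segmentation import mark_boundaries

def predict_proba(images: np.ndarray) -> np.ndarray:
    """Stand-in classifier: scores each image by mean brightness.
    Replace with your real model's probability function ([p_dog, p_wolf])."""
    wolf = images.mean(axis=(1, 2, 3))           # brighter (snowier) -> more "wolf"
    return np.stack([1.0 - wolf, wolf], axis=1)

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))                  # placeholder for the Corgi photo

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image,
    predict_proba,    # the black-box model we want to interrogate
    top_labels=2,     # explain the two highest-scoring classes
    num_samples=300,  # perturbed copies of the image used to fit the explanation
)

# Recover a mask over the superpixels that contributed most to the top label.
img, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5, hide_rest=False
)
highlighted = mark_boundaries(img, mask)
```

If the highlighted regions turn out to be mostly snow and empty background rather than the animal itself, the model is keying on exactly the kind of misplaced evidence the husky-versus-wolf study uncovered.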
These are the kinds of questions MITRE staff are exploring with the intent of creating human-usable tools that emphasize which information is most valuable to any classification. The real importance of this work comes back to bias. In the case above, the stakes are fairly low, but machine learning has significant applications beyond categorizing wildlife.
What happens in the real world?
Training machines to learn is in one sense conceptually similar to the way that humans learn and make decisions. Humans learn to make reasonable decisions based on their exposure to experiences in their environment throughout their lifetime. If human learning or decisions are made in the context of poor data or poor experiences, then there is the risk that an individual will make undesirable decisions or decisions that do not accurately interpret the data presented to them. For instance, they may see a wolf in the wild and try to pet it!
Similarly, machines learn based on the data to which they are exposed. Like humans, undesirable outcomes – potentially becoming life-threatening or illegal – could arise from machines being trained upon data that is not representative across a multitude of scenarios or populations, that unintentionally favors some outcomes over others, or has been intentionally manipulated for malicious purposes by an insider or external entity. Here is a very real-world scenario that clarifies that what you train on really matters.
Recently, California courts ruled that their cash bail system was an unconstitutional denial of due process. The courts reached this decision because they found that the way bail was being determined further marginalized already disenfranchised members of society, particularly poor people and people of color [1]. To address this, they introduced SB10, a bill that establishes a new process of "pretrial risk assessment." Under the bill, pretrial risk assessments will be conducted by Pretrial Assessment Services using a validated risk assessment tool. The intent of the tool is to remedy the bias by providing an assessment of the risk that a person will fail to appear in court. [2]
The law's use of "risk assessment tool" can be understood to mean an algorithm that takes "arrest and conviction history and other data" [1] with the goal of determining a defendant's risk. The idea is that by using an algorithm, we are lending a sense of objectivity to the risk assessment process. Unfortunately, just as in the algorithm examples we explored above, that means there is an opportunity for bias to play a significant role in "determining the defendant's risk."
Consider the fundamental nature of this bill. It was created to replace a process deemed unconstitutional because it disproportionately affected poor people and people of color. But where will we get the training data for our new, SB10-compliant algorithm? Most likely by observing past cases to inform the likelihood that someone is a flight risk, the very cases characterized by the bias that created the need for this bill in the first place.
Furthermore, assume that we continue to tweak our algorithm over time. Ideally, this adjustment would mean making corrections based on observed outcomes that differ from the algorithm’s own predictions. This adjustment creates further opportunity for bias by causing a feedback loop of sorts, thereby strengthening any bias that may have existed during the algorithm’s creation and training.
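To see how such a feedback loop can lock bias in rather than correct it, consider the toy simulation below. Every number in it is invented: both groups share the same true failure-to-appear rate, but one starts with an inflated score inherited from biased historical data. Because defendants above the threshold are detained, the data never produces the evidence that would lower that group's score.

```python
# Toy simulation of a bias feedback loop; all numbers are invented.
import random
random.seed(0)

TRUE_FTA_RATE = 0.20      # both groups' actual failure-to-appear rate
DETAIN_THRESHOLD = 0.5    # defendants scored at or above this are detained
scores = {"group_a": 0.30, "group_b": 0.60}   # group_b starts over-scored

for round_num in range(1, 6):
    for group, score in scores.items():
        if score >= DETAIN_THRESHOLD:
            # Detained defendants never get the chance to appear, so the data
            # contains no evidence that the high score was wrong.
            continue
        # Released defendants' outcomes are observed, and retraining nudges
        # the score toward the rate actually seen in the data.
        observed = sum(random.random() < TRUE_FTA_RATE for _ in range(200)) / 200
        scores[group] = 0.7 * score + 0.3 * observed
    print(round_num, {g: round(s, 2) for g, s in scores.items()})
```

In this sketch the group below the detention threshold drifts toward the true 20 percent rate, while the over-scored group is never released, generates no outcomes that could contradict its inflated score, and keeps that score through every retraining round.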
So where do we go from here?
The point is not that we will ever be able to remove bias entirely, but that we must understand it if we hope to rely on these tools. Generally speaking, bias can come from almost anywhere: limitations in the available data, the way that data is presented, how new data affects the algorithm over time. We need tools that help us explore these factors and identify sources of bias, so we can understand whether we need to account for them. Even if we could create a perfect algorithm, it should not be the sole authority on why a decision was made, especially when that decision is made in a vacuum. We need tools that not only compute an outcome, but also explain the logical steps taken to reach it.
In the SB10 case above, it is culturally acceptable to identify someone as high risk based on their past behavior during previous arrests. But we need to make sure their record is the primary justification for the computed result, and not tertiary information about the defendant's life that should have no bearing on the case. [3]
These are the kinds of questions and considerations MITRE staff explore every day. MITRE’s subject matter experts are already working to understand the inherent bias common to any collection of data so that we can recognize it before the training begins. Through this and other ongoing work, MITRE hopes to continue to contribute to better machine learning support. This way we can appreciate exactly which information from any input is used to explain an outcome, and furthermore build models we can confidently apply to the real world. This is just one of many ways MITRE is solving problems for a safer world.
Cameron Boozarjomehri is a Software Engineer and a member of MITRE’s Privacy Capability. His passion is exploring the applications and implications of emerging technologies and finding new ways to make those technologies accessible to the public.
[1] Westervelt, E. (2018, October 2). California's Bail Overhaul May Do More Harm Than Good, Reformers Say. Retrieved from: https://www.npr.org/2018/10/02/651959950/californias-bail-overhaul-may-do-more-harm-than-good-reformers-say
[2] California Legislative Information. (n.d.). Senate Bill No. 10. Retrieved from: https://leginfo.legislature.ca.gov/faces/billNavClient.xhtml?bill_id=201720180SB10
[3] Levin, S. (2018, September 7). Imprisoned by algorithms: the dark side of California ending cash bail. Retrieved from: https://www.theguardian.com/us-news/2018/sep/07/imprisoned-by-algorithms-the-dark-side-california-ending-cash-bail
© 2018 The MITRE Corporation. All rights reserved. Approved for public release. Distribution unlimited. Case number 18-3872.
MITRE's mission-driven teams are dedicated to solving problems for a safer world. Learn more about MITRE.