Krinkle | 1 Apr 01:17 2011

Re: Toolserver Intuition - Tech spec (Toolserver goes I18N!)

Platonides wrote:
> Krinkle wrote:
>> -- Other features
>>
>> * Variable replacement ($1, $2, etc.)
>>
>> * Fallback languages:
>> * Getting language names
>>
>> * Escaping (i.e. options = array( 'escape' => 'html' ))
>
> How much duplication will TsIntuition have with MediaWiki i18n code?

The basic principle is the same: loading messages from i18n PHP files,
translating at TranslateWiki, getting messages with fallback, etc.

However, the main difference is that it will be more basic and simplified:
* No registered users
* No dependencies on other code or database connections to be made.
* Other than replacing variables and gender/plural, there will be no
  parsing (i.e. no ''markup'', {{templates}}, [[links]], {{#or}},
  <whatever>)
* No 'site language' vs. user language.
* Language/gender choice comes from cookies and/or the browser agent,
  rather than from user account preferences retrieved from a database.
* Isolated text domains.
* No converters, backwards compatibility or wiki-environment factors
to take into account.
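
To make the lookup described above concrete, here is a minimal sketch of message retrieval with a fallback chain, $1-style variable replacement and optional HTML escaping. It is written in Python purely for illustration and is not TsIntuition's actual code or API; the names `MESSAGES`, `FALLBACKS` and `msg`, the sample messages and the bracketed-key behaviour for missing messages are all assumptions.

```python
import html
import re

# Hypothetical message store: text domain -> language -> key -> message.
MESSAGES = {
    "mytool": {
        "en": {"welcome": "Welcome, $1!", "bye": "Goodbye, $1."},
        "nl": {"welcome": "Welkom, $1!"},
    }
}

# Hypothetical fallback chains; unknown languages fall back to English.
FALLBACKS = {"nl": ["en"], "en": []}


def msg(domain, key, lang, variables=(), escape=None):
    """Look up a message, walking the fallback chain, then fill in $1, $2, ..."""
    for code in [lang] + FALLBACKS.get(lang, ["en"]):
        text = MESSAGES.get(domain, {}).get(code, {}).get(key)
        if text is not None:
            break
    else:
        # Not found in any language: return the key in brackets (an assumption).
        return "[" + key + "]"

    def substitute(match):
        index = int(match.group(1)) - 1
        if 0 <= index < len(variables):
            return str(variables[index])
        return match.group(0)  # leave unknown placeholders untouched

    text = re.sub(r"\$(\d+)", substitute, text)
    return html.escape(text) if escape == "html" else text
```

Because each text domain has its own dictionary, one tool's keys cannot collide with another's, which is roughly what "isolated text domains" would mean in such a design.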

One could describe TsIntuition as a lightweight i18n system for [...]

Brett Hillebrand | 1 Apr 09:00 2011

Privacy Violation?

I have reason to believe that one of Betacommand's tools is currently
violating the Toolserver's privacy policy by profiling individual users'
editing times and edited articles for comparison, as seen at:

http://toolserver.org/~betacommand/UserCompare/

This information is not normally aggregated in a way that makes it easily
obtainable; compiling it would require some effort and maths.

"Tools that allow profiling of individual user's activity (beyond what can
easily be achieved directly on the public wiki sites) must only be applied
with the respective user's consent (opt-in)."

Whilst such information can be used against abusive editing, it promotes a
gross violation of privacy.

Cheers,

Brett Hillebrand
User:Promethean  <at>  en_wiki
ACC Developer

This message and its attachments may contain confidential information that
is intended only for the individual named. If you are not the named
addressee you should not disseminate, distribute or copy this e-mail. Please
notify the sender immediately by e-mail if you have received this e-mail by
mistake and delete this e-mail from your system. E-mail transmission cannot
be guaranteed to be secure or error-free as information could be
intercepted, corrupted, lost, destroyed, arrive late or incomplete, or
contain viruses. The sender therefore does not accept liability for any
errors or omissions in the contents of this message, which arise as a result
of e-mail transmission.

Platonides | 1 Apr 10:00 2011

Re: Privacy Violation?

Brett Hillebrand wrote:
> I have reason to believe that one of Betacommand's tools is currently
> violating the Toolserver's privacy policy by profiling individual users
> editing times and edited articles for comparative reasons as seen at:
> 
> http://toolserver.org/~betacommand/UserCompare/
> 
> This information would not normally be aggregated in such a way that it is
> easily obtainable and would require some effort and maths to do so.
> 
> "Tools that allow profiling of individual user's activity (beyond what can
> easily be achieved directly on the public wiki sites) must only be applied
> with the respective user's consent (opt-in)."
> 
> Whilst such information can be used against abusive editing, it promotes a
> gross violation of privacy.
> 
> Cheers,
> 
> Brett Hillebrand
> User:Promethean  <at>  en_wiki
> ACC Developer

Have you talked about your concerns to BetaCommand?

PS: You should have created a new mail for the new topic. Otherwise it
gets threaded incorrectly.
It is not an uncommon sin, but the people who commit it at least don't
usually start the new topic by top-posting it above the old one.


John | 1 Apr 12:52 2011

Re: Privacy Violation?

He has not, and the data collected via user-compare is generated solely from the API and is almost exclusively used for SPI (http://en.wikipedia.org/wiki/Wikipedia:Sockpuppet_investigations), where gathering and analyzing this data is standard practice. Had I been using non-public data (anything generated from the SQL databases that normal users do not have access to), I would agree that there may be privacy issues. However, every piece of data that is used for that tool comes from the en.wikipedia.org API.


Betacommand

Daniel Kinzler | 1 Apr 13:21 2011

Re: Privacy Violation?

Hi John

The fact that you are using public data for your analysis does NOT mean that
it's compliant with the policy.

In fact, this policy was put into place precisely to make clear that even when
using public data, making available an analysis may STILL constitute a privacy
violation. From the Toolserver policy page:

"analysis of publically available data (data mining) may well lead to
information that compromizes the privacy of individuals (profiling). The fact
that anyone could in theory perform this analysis does not justify the
publication of such information."

Making this kind of information available to a closed circle of users entrusted
by the community with special powers, such as admins with checkuser privileges,
is one thing. Making it available to the public is quite another.

Please make sure that your tools do not make available any analysis that allows
insight into people's habits or lifestyle, beyond what is easily and directly
visible on Wikipedia itself.

Regards,
Daniel

On 01.04.2011 12:52, John wrote:
> He has not, and the data collected via user-compare is generated solely via data
> collected from the API and almost exclusively used for  SPI
> http://en.wikipedia.org/wiki/Wikipedia:Sockpuppet_investigations where gathering
> and analyzing this data is standard practice. Had I been using non-public data
> (anything generated from the sql databases that normal users do not have access
> to) I would agree that there may be privacy issues, however every piece of data
> that is used for that tool comes from the en.wikipedia.org
> <http://en.wikipedia.org> API.

Carl (CBM) | 1 Apr 13:28 2011

Re: Privacy Violation?

On Fri, Apr 1, 2011 at 7:21 AM, Daniel Kinzler <daniel <at> brightbyte.de> wrote:
> The fact that you are using public data for your analysis does NOT mean that
> it's compliant with the policy.

As I understand it, there are three options for tools like this:

* Limit the tool to trusted users, so that the analysis is not
publicly available.  This is the simplest option, but you should check
with the TS admins whether this would be acceptable.

* Run the tool on a host other than toolserver and don't use the
toolserver databases as a data source. Then the toolserver privacy
policy doesn't apply.

* Get consent from the users whose data is being analyzed. This is
impractical for investigating sockpuppets.

The underlying source of the privacy policy is that the toolserver is
associated with Wikimedia Deutschland, and German privacy law is not
the same as U.S. privacy law.

- Carl

John | 1 Apr 13:39 2011

Re: Privacy Violation?

All that the tool does is merge [[Special:Contributions]] of multiple users and show pages that multiple accounts have edited in common. I'm really not sure how that could be considered a privacy issue.

Betacommand

_______________________________________________
Toolserver-l mailing list (Toolserver-l <at> lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list: https://wiki.toolserver.org/view/Mailing_list_etiquette

Brett Hillebrand | 1 Apr 13:41 2011

Re: Privacy Violation?

The aggregation or representation of individual users' editing habits, such as what time they mostly edit, and then overlapping these edits and times with other users, is a violation of the Toolserver privacy policy, whether it uses the API, the database or otherwise, because it is information that reveals the users' lifestyle (the time bit especially), and such information would not normally be so easily available, especially if the user has thousands of edits.

Whether it is primarily used by SPI or not is irrelevant, as at present you have no control over who views that info and for what purpose. You also seem to keep these reports whether the accounts within were socks or not.

I did not contact you directly, as possible privacy violations of any sort are a matter of interest for Toolserver staff and other Toolserver users who may or may not run similar scripts.

Cheers,

Brett Hillebrand
User:Promethean <at> en_wiki
ACC Developer


Daniel Kinzler | 1 Apr 14:18 2011

Re: Privacy Violation?

On 01.04.2011 13:41, Brett Hillebrand wrote:
> The aggregation or representation of individual users editing habits such as
> *what time they mostly edit and then overlapping these edits and times with
> other users* is a violation of Toolserver privacy policy wether it is using the
> API, Database or otherwise because its information that reveals the users
> lifestyle (the time bit especially) and such information would not normally be
> so easily available especially if the user has 1000’s of edits.

@Brett: Can you please specify what in the reports generated by Betacommand you
deem problematic? Is it only the "Normal edit time" bit, or did I miss something?

I think the main feature here is finding the co-authorship sets for two or three
users. I think that by itself should generally be fine, though with some effort
I could construct some case where it *might* compromise someone's privacy...

@John: Would it significantly reduce the usefulness of your tool if you removed
the "Normal edit time" bit? Also, could you throw away the reports on a regular
basis, once they are no longer needed? I think that would already help a bit.

> I did not contact you directly as possible Privacy violations of any sort is a
> matter of interest for Toolserver Staff and other Toolserver users who may or
> may not run similar scripts.

Well, I'd recommend talking to the person in question directly as a first
option. It seems more friendly than screaming bloody murder right away. On the
other hand, if someone is persistently violating policy, talking to the admins
is of course the right thing to do.

Regards,
Daniel

Stephen Bain | 1 Apr 14:28 2011

Re: Privacy Violation?

On Fri, Apr 1, 2011 at 6:00 PM, Brett Hillebrand
<bretthillebrand <at> internode.on.net> wrote:
>
> "Tools that allow profiling of individual user's activity (beyond what can
> easily be achieved directly on the public wiki sites) must only be applied
> with the respective user's consent (opt-in)."

Well, the policy is pretty vague (indeed, you have quoted the whole of it there).

What counts as profiling and what does not? And what "can easily be
achieved" using only the wiki?

The editing overlap can be reproduced quite straightforwardly using
Special:Contributions and article history pages (or the API), perhaps
with the aid of a pencil and paper and the browser's search function
for the larger sets. There could be quite a bit of labour in that
though. Does that count as "easy"?
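
To illustrate how little computation the overlap itself involves, here is a minimal sketch in Python. It is not the UserCompare tool's code; the function name `common_pages` and the sample accounts and titles are made up, and it assumes each account's edited page titles have already been collected (for example by reading Special:Contributions), so no fetching is shown.

```python
from collections import defaultdict


def common_pages(contributions, min_users=2):
    """Map each page title to the sorted accounts that edited it,
    keeping only pages edited by at least min_users accounts."""
    editors = defaultdict(set)
    for user, titles in contributions.items():
        for title in titles:
            editors[title].add(user)  # a set ignores repeated edits by one account
    return {
        title: sorted(users)
        for title, users in editors.items()
        if len(users) >= min_users
    }


# Made-up sample data standing in for each account's contribution list.
sample = {
    "ExampleUserA": ["Foo", "Bar", "Baz", "Foo"],
    "ExampleUserB": ["Bar", "Qux"],
    "ExampleUserC": ["Bar", "Baz"],
}
overlap = common_pages(sample)
```

The labour is in gathering the lists rather than in comparing them, which is the part the question about "easy" turns on.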

-- 
Stephen Bain
stephen.bain <at> gmail.com

