> I'm going to keep this short and link to a longer post [1] if you want more
> background. Basically I'm hoping that all of us will be able to agree on one
> method of handling these invalid torrents and implement that for the next
> release of our software. We should also update the unofficial spec and try
> to get the official spec amended with the results of this discussion.
> Essentially the problem is that if a .torrent file exists which has unsorted
> dictionary keys [2] in its info dictionary then there are three ways in
> which it can be parsed and two possible infohashes which can be generated:
> 1) You can decode the info dictionary, order the keys (as per spec) then
> generate the infohash using the sorted keys.
> 2) You can take a substring from the .torrent file which spans the info
> dictionary and just run those raw bytes through a SHA1 hash and generate the
> infohash. This generates a *different* infohash from method 1.
> 3) Discard the torrent as invalid and refuse to process.
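To make the difference between methods 1 and 2 concrete, here is a minimal Python sketch. The tiny `bdecode`/`bencode` helpers and the sample info dictionary are made up for illustration and are not taken from any client; the decoder is deliberately lenient so that it accepts the unsorted keys.

```python
import hashlib

def bdecode(data, i=0):
    """Lenient decoder: returns (value, next_index) and accepts unsorted keys."""
    c = data[i:i + 1]
    if c == b"i":                       # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l" or c == b"d":          # list or dictionary, both end with 'e'
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            value, i = bdecode(data, i)
            items.append(value)
        if c == b"l":
            return items, i + 1
        # a dictionary is key, value, key, value ... in file order
        return dict(zip(items[0::2], items[1::2])), i + 1
    colon = data.index(b":", i)         # byte string: <length>:<bytes>
    n = int(data[i:colon])
    return data[colon + 1:colon + 1 + n], colon + 1 + n

def bencode(value):
    """Canonical encoder: dictionary keys are always written in sorted order."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        pairs = (bencode(k) + bencode(value[k]) for k in sorted(value))
        return b"d" + b"".join(pairs) + b"e"
    raise TypeError(type(value))

# Hypothetical info dictionary with keys out of order: "name" before "length".
raw_info = b"d4:name1:a6:lengthi5ee"
torrent = b"d4:info" + raw_info + b"e"

meta, _ = bdecode(torrent)
hash_sorted = hashlib.sha1(bencode(meta[b"info"])).hexdigest()  # method 1
hash_raw = hashlib.sha1(raw_info).hexdigest()                   # method 2
print(hash_sorted != hash_raw)
```

Method 1 hashes `d6:lengthi5e4:name1:ae` while method 2 hashes the bytes exactly as they appear in the file, so two clients that pick different methods would end up looking for different swarms.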
> So pros and cons:
> Approach 1:
> Pro: This approach implies that if an invalid bencoded dictionary is found
> it should be converted into a valid representation and used.
> Pro: This should be relatively trivial for most clients to implement.
> Con: The letter of the spec [3] says that we should always use a substring
> of the .torrent metadata. Strictly speaking this approach goes against the
> spec. However, we have to assume that when the spec refers to ".torrent
> metadata" it refers to *specification compliant* .torrent metadata, i.e. the
> keys in the metadata must be sorted for it to be considered valid .torrent
> metadata. Conversely, if the keys are *not* sorted then the data should not
> be considered valid .torrent metadata. If that's the case, then why are you
> generating a valid infohash from invalid metadata?
> Con: You need a non-spec compliant way of decoding bencoded data.
> Approach 2:
> Pro: You follow the letter of the BitTorrent specification, but you still
> break the BEncoding specification.
> Con: I'd argue that this is slightly more complex to implement as you may
> now need to double parse the .torrent file in order to generate the
> infohash. The first time you pass it through your bencoded data decoder to
> generate your in-memory representation. Then you have to parse the file
> manually a second time to find the start and end of the info dictionary and
> extract that substring.
> Con: I'd argue that you're breaking the spirit of the specification again as
> you're now running under the assumption that invalid bencoded data is valid
> .torrent metadata and the spec doesn't explicitly allow this ;)
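The "second pass" for approach 2 could look roughly like the sketch below. It does not build any objects; it only walks the top-level dictionary to find where the info value starts and ends. The names `skip` and `info_span` are invented for this example, and the sketch assumes every top-level key is a byte string.

```python
import hashlib

def skip(data, i):
    """Return the index just past the bencoded value that starts at data[i]."""
    c = data[i:i + 1]
    if c == b"i":                               # integer: i<digits>e
        return data.index(b"e", i) + 1
    if c in (b"l", b"d"):                       # list or dictionary
        i += 1
        while data[i:i + 1] != b"e":
            i = skip(data, i)
        return i + 1
    colon = data.index(b":", i)                 # byte string: <length>:<bytes>
    return colon + 1 + int(data[i:colon])

def info_span(torrent):
    """Return (start, end) of the raw 'info' value in the top-level dictionary."""
    if torrent[:1] != b"d":
        raise ValueError("top-level value is not a dictionary")
    i = 1
    while torrent[i:i + 1] != b"e":
        key_end = skip(torrent, i)              # assumes the key is a byte string
        colon = torrent.index(b":", i)
        key = torrent[colon + 1:key_end]
        value_end = skip(torrent, key_end)
        if key == b"info":
            return key_end, value_end
        i = value_end
    raise ValueError("no info dictionary found")

# Hypothetical torrent whose info dictionary has unsorted keys.
raw_info = b"d4:name1:a6:lengthi5ee"
torrent = b"d8:announce3:url4:info" + raw_info + b"e"

start, end = info_span(torrent)
infohash = hashlib.sha1(torrent[start:end]).hexdigest()
print(torrent[start:end] == raw_info)
```

A decoder that records byte offsets while it parses could return the same span from the first pass, which would avoid the second walk, at the cost of touching the decoder itself.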
> Approach 3:
> Pro: Simplest to implement as you just have to make your bencoded data
> decoder spec compliant. Once this happens, the .torrent won't be loadable so
> the question of how to generate the infohash never has to be answered.
> Con: It will render some torrents unloadable but those should be a very
> small percentage.
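For approach 3, the only change to a decoder is a check that each dictionary key is strictly greater than the previous one when compared as raw bytes. Below is a minimal sketch; `bdecode_strict` is a made-up name, and the sketch does not enforce other strictness rules such as rejecting leading zeros in integers.

```python
def bdecode_strict(data, i=0):
    """Return (value, next_index); raise ValueError on unsorted or duplicate keys."""
    c = data[i:i + 1]
    if c == b"i":                               # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                               # list: l<values>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            value, i = bdecode_strict(data, i)
            items.append(value)
        return items, i + 1
    if c == b"d":                               # dictionary: d<key><value>...e
        i += 1
        result, previous = {}, None
        while data[i:i + 1] != b"e":
            key, i = bdecode_strict(data, i)
            if not isinstance(key, bytes):
                raise ValueError("dictionary key is not a byte string")
            # bytes compare lexicographically by byte value in Python
            if previous is not None and key <= previous:
                raise ValueError("dictionary keys are not in sorted order")
            value, i = bdecode_strict(data, i)
            result[key] = value
            previous = key
        return result, i + 1
    colon = data.index(b":", i)                 # byte string: <length>:<bytes>
    n = int(data[i:colon])
    return data[colon + 1:colon + 1 + n], colon + 1 + n

try:
    bdecode_strict(b"d4:name1:a6:lengthi5ee")   # "name" comes before "length"
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

With a decoder like this the unsorted torrent never loads, so the choice between the two infohashes never comes up.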
> [1] http://forum.utorrent.com/viewtopic.php?pid=431793#p431793
> [2] http://wiki.theory.org/BitTorrentSpecification#dictionaries
> [3] "The 20 byte sha1 hash of the bencoded form of the info value from the
> metainfo file. Note that this is a substring of the metainfo file."