Pierre de Buyl | 4 Oct 22:10 2011

Re: tentative of synthesis: parameters, box size, particle number

On 15 Sep 2011, at 21:28, Konrad Hinsen wrote:
> On 14 sept. 11, at 09:39, Pierre de Buyl wrote:
> 
>> 2. box size
>>   - The offset should be given, each in their "H5MD container" with step and time.
>>   - How different is what we do with respect to, for instance, PDB or gromacs?
>>      see the manual of gromacs, chapter 3, §3.1 and §3.2 at http://www.gromacs.org/ second news item or direct link ftp://ftp.gromacs.org/pub/manual/manual-4.5.4.pdf
>>     pdb http://www.wwpdb.org/documentation/format33/sect8.html
> 
> I don't know about Gromacs, but I know all about PDB. It's a format for storing crystal conformations, so they follow crystallography conventions. It's also a format for single conformations, so time dependence is not an issue.
> 
> The size of the unit cell is given by the three edge lengths (a, b, c, in Angstrom), and the shape by three angles (alpha, beta, gamma, in degrees). The first lattice vector points along the x axis, the second lies in the x-y plane. Symmetry is specified by the name of the symmetry group, which implies a set of symmetry transformations and constraints on the unit cell shape, but those are supposed to be known (tabulated), so they are not stored explicitly.
> 
> The PDB conventions are not very practical for simulations. Simulation programs need the lattice vectors, not their lengths and angles. The use of named symmetry groups isn't practical either, unless we want all simulation programs to incorporate the tables published by the IUCr (International Union of Crystallography).
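
As an illustration of the conversion a simulation program would need, here is a minimal sketch that turns the PDB-style cell description (a, b, c, alpha, beta, gamma) into three lattice vectors, using the orientation described above (first vector along x, second in the x-y plane). The function name `cell_to_lattice_vectors` is hypothetical, and the usual crystallographic angle convention is assumed (alpha between b and c, beta between a and c, gamma between a and b).

```python
import math


def cell_to_lattice_vectors(a, b, c, alpha, beta, gamma):
    """Return three lattice vectors for cell lengths a, b, c and angles in degrees.

    Assumed convention: alpha is the angle between b and c, beta between
    a and c, gamma between a and b. The first vector lies along x and the
    second lies in the x-y plane.
    """
    cos_a = math.cos(math.radians(alpha))
    cos_b = math.cos(math.radians(beta))
    cos_g = math.cos(math.radians(gamma))
    sin_g = math.sin(math.radians(gamma))

    vec_a = (a, 0.0, 0.0)
    vec_b = (b * cos_g, b * sin_g, 0.0)
    # The components of the third vector follow from its required
    # scalar products with the first two vectors and from its length c.
    cy = (cos_a - cos_b * cos_g) / sin_g
    cz = math.sqrt(1.0 - cos_b * cos_b - cy * cy)
    vec_c = (c * cos_b, c * cy, c * cz)
    return vec_a, vec_b, vec_c


if __name__ == "__main__":
    # A hexagonal cell: a = b, gamma = 120 degrees.
    for vec in cell_to_lattice_vectors(3.0, 3.0, 5.0, 90.0, 90.0, 120.0):
        print(vec)
```

Because cos(90°) is not exactly zero in floating point, a reader comparing the result with an orthorhombic box should use a small tolerance rather than exact equality.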

Does anyone see a limitation in using the scheme proposed by Felix?
box
  \-- offset
    \-- value [var][d] e.g., (-L_x/2, -L_y/2, -L_z/2)
    \-- time [var]
    \-- step [var]

Felix Höfling | 5 Oct 10:16 2011

Re: tentative of synthesis: parameters, box size, particle number

On 04.10.2011, at 22:10, Pierre de Buyl <pdebuyl@...> wrote:

>
> Does anyone see a limitation in using the scheme proposed by Felix?
> box
>   \-- offset
>     \-- value [var][d] e.g., (-L_x/2, -L_y/2, -L_z/2)
>     \-- time [var]
>     \-- step [var]
>   \-- edges
>     \-- value [var][d][d] e.g., ((L_x, 0, 0), (0, L_y, 0), (0, 0, L_z))
>     \-- time [var]
>     \-- step [var]
> The more elaborate scheme seems overkill.
> We could think of setting an attribute to the "box" group that would  
> indicate the type of box information.
> box
>   +-- kind = [ cubic | triclinic ]
>   +-- time_dependent = [ 0 | 1 ]
>
> Then, future revisions could easily adapt it by adding box types.
>

I think the above scheme naturally includes triclinic boxes. What is then
the benefit of the attribute "kind"? If it is set to "cubic", one could
use scalars instead of matrices for the edge value. But such an approach
could easily result in an endless distinction of cases for H5MD readers.
On the other hand, knowing that the box is cuboid/orthorhombic would
simplify the computation of, e.g., the box volume. (In the present scheme,

Pierre de Buyl | 6 Oct 11:13 2011

Re: tentative of synthesis: parameters, box size, particle number

On 5 Oct 2011, at 10:16, Felix Höfling wrote:
> On 04.10.2011, at 22:10, Pierre de Buyl wrote:
>> Does anyone see a limitation in using the scheme proposed by Felix?
>> box
>>  \-- offset
>> (...)
>> The more elaborate scheme seems overkill.
>> We could think of setting an attribute to the "box" group that would indicate the type of box information.
>> box
>>  +-- kind = [ cubic | triclinic ]
>>  +-- time_dependent = [ 0 | 1 ]
>> 
>> Then, future revisions could easily adapt it by adding box types.
>> 
> I think the above scheme naturally includes triclinic boxes. What is then
> the benefit of the attribute "kind"? If it is set to "cubic", one could
> use scalars instead of matrices for the edge value. But such an approach
> could easily result in an endless distinction of cases for H5MD readers.
> On the other hand, knowing that the box is cuboid/orthorhombic would
> simplify the computation of, e.g., the box volume. (In the present scheme,
> one needs to compute the determinant of the edges matrix—which is not hard
> if an algorithm like numpy.linalg.det is available.)
> 
> If we find it really necessary to provide additional meta information such as
> box kind or time dependence, I would prefer a Boolean scheme that allows
> these features to be combined independently (time_dependence: true/false,
> orthogonal: true/false, internal_symmetries: true/false). But I think that
> all the information can be retrieved as well from the dataset extents
> (time_dependence), their contents (orthogonality), or presence of
> attributes (transformation, see below).
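
To make the volume remark in the quoted text concrete, here is a minimal sketch under stated assumptions: d = 3, and the rows of the `edges` matrix are the box edge vectors. The function names `box_volume` and `is_axis_aligned` are hypothetical. It computes the volume as the absolute value of the determinant, and shows how a reader could recover orthogonality from the dataset contents by testing for vanishing off-diagonal components.

```python
def box_volume(edges):
    """Volume of the box spanned by the three rows of a 3x3 edges matrix."""
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = edges
    # Determinant expanded along the first row.
    det = (ax * (by * cz - bz * cy)
           - ay * (bx * cz - bz * cx)
           + az * (bx * cy - by * cx))
    return abs(det)


def is_axis_aligned(edges, tol=1e-12):
    """True if every off-diagonal component is zero within tol.

    This is the cuboid case ((L_x, 0, 0), (0, L_y, 0), (0, 0, L_z)),
    for which the volume reduces to the product L_x * L_y * L_z.
    """
    return all(abs(edges[i][j]) <= tol
               for i in range(3) for j in range(3) if i != j)


if __name__ == "__main__":
    cuboid = ((2.0, 0.0, 0.0), (0.0, 3.0, 0.0), (0.0, 0.0, 4.0))
    sheared = ((2.0, 0.0, 0.0), (1.0, 3.0, 0.0), (0.5, 0.5, 4.0))
    for edges in (cuboid, sheared):
        print(is_axis_aligned(edges), box_volume(edges))
```

Note that `is_axis_aligned` is stricter than orthogonality of the edge vectors: a rotated cuboid has orthogonal edges but nonzero off-diagonal components.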

Pierre de Buyl | 7 Oct 08:46 2011

Re: tentative of synthesis: parameters, box size, particle number

On 15 Sep 2011, at 18:09, Felix Höfling wrote:
> On 14.09.2011, at 09:39, Pierre de Buyl <pdebuyl@...> wrote:
>> Hi everyone,
>> 
>> There has been a lot of traffic on the list and I couldn't keep up. I'll try to make a synthesis of my suggestions, remarks and questions.
>> 
>> Please, if possible, keep the discussion in this single thread so as to remain focused on these items.
>> 
>> 1. parameters "subgroups"
>>      - I think it is a good idea to devise a few (let us keep it to a minimum) shared parameters and to put program parameters in "/parameters/program_name".
>>      - In my opinion, time-dependent information should not be in "parameters" but in "observables".
>> 
> The idea behind moving 'particle_number' and 'box' to /parameters was that they should become mandatory for a H5MD file, while the /observables group is optional. I have encountered several situations where both pieces of information are essential for the further interpretation of the data, independent of where the data come from (trajectory or observables). Of course, they could be time-dependent datasets (as we have in both trajectory and observables so far).

Well, "observables" is, in my view, everything that changes during the simulation, which is the case for the box
size. The box volume, for instance, is a proper physical observable.

>> 2. box size
>>     - The offset should be given, each in their "H5MD container" with step and time.
>>     - How different is what we do with respect to, for instance, PDB or gromacs?
>>        see the manual of gromacs, chapter 3, §3.1 and §3.2 at http://www.gromacs.org/ second news item or direct link ftp://ftp.gromacs.org/pub/manual/manual-4.5.4.pdf
>>       pdb http://www.wwpdb.org/documentation/format33/sect8.html
> Good question, we should look that up.

Felix Höfling | 7 Oct 10:02 2011

Re: tentative of synthesis: parameters, box size, particle number

On 06.10.2011, at 11:13, Pierre de Buyl <pdebuyl@...> wrote:

> On 5 Oct 2011, at 10:16, Felix Höfling wrote:
>> On 04.10.2011, at 22:10, Pierre de Buyl wrote:
>>> Does anyone see a limitation in using the scheme proposed by Felix?
>>> box
>>>  \-- offset
>>> (...)
>>> The more elaborate scheme seems overkill.
>>> We could think of setting an attribute to the "box" group that would  
>>> indicate the type of box information.
>>> box
>>>  +-- kind = [ cubic | triclinic ]
>>>  +-- time_dependent = [ 0 | 1 ]
>>>
>>> Then, future revisions could easily adapt it by adding box types.
>>>
>> I think the above scheme naturally includes triclinic boxes. What is  
>> then
>> the benefit of the attribute "kind"? If it is set to "cubic", one could
>> use scalars instead of matrices for the edge value. But such an approach
>> could easily result in an endless distinction of cases for H5MD readers.
>> On the other hand, knowing that the box is cuboid/orthorhombic would
>> simplify the computation of, e.g., the box volume. (In the present  
>> scheme,
>> one needs to compute the determinant of the edges matrix—which is not  
>> hard
>> if an algorithm like numpy.linalg.det is available.)
>>

Felix Höfling | 7 Oct 15:38 2011

Re: tentative of synthesis: parameters, box size, particle number

On 07.10.2011, at 08:46, Pierre de Buyl <pdebuyl@...> wrote:

> On 15 Sep 2011, at 18:09, Felix Höfling wrote:
>> On 14.09.2011, at 09:39, Pierre de Buyl <pdebuyl@...> wrote:
>>> Hi everyone,
>>>
>>> There has been a lot of traffic on the list and I couldn't keep up.  
>>> I'll try to make a synthesis of my suggestions, remarks and questions.
>>>
>>> Please, if possible, keep the discussion in this single thread so as  
>>> to remain focused on these items.
>>>
>>> 1. parameters "subgroups"
>>>      - I think it is a good idea to devise a few (let us keep it to a  
>>> minimum) shared parameters and to put program parameters in  
>>> "/parameters/program_name".
>>>      - In my opinion, time-dependent information should not be in  
>>> "parameters" but in "observables".
>>>
>> The idea behind moving 'particle_number' and 'box' to /parameters was  
>> that they should become mandatory for a H5MD file, while the  
>> /observables group is optional. I have encountered several situations  
>> where both pieces of information are essential for the further interpretation of the
>> data, independent of where the data come from (trajectory or  
>> observables). Of course, they could be time-dependent datasets (as we  
>> have in both trajectory and observables so far).
>
> Well, "observables" is, in my view, all that changes during the  

Pierre de Buyl | 10 Oct 12:04 2011

Re: tentative of synthesis: parameters, box size, particle number


On 7 Oct 2011, at 15:38, Felix Höfling wrote:

> On 07.10.2011, at 08:46, Pierre de Buyl <pdebuyl@...> wrote:
>
>> On 15 Sep 2011, at 18:09, Felix Höfling wrote:
>>> On 14.09.2011, at 09:39, Pierre de Buyl <pdebuyl@...> wrote:
>>>> Please, if possible, keep the discussion in this single thread so as  
>>>> to remain focused on these items.
>>>> 1. parameters "subgroups"
>> Well, "observables" is, in my view, all that changes during the  
>> simulation, which is the case of the box size. The box volume, for  
>> instance, is a proper physical observable.
>>
> Sure (although a fixed box size would be a bit boring).
>
>>>> 2. box size
>> A file should always, in my opinion, contain the observables group.
>> I go on after point 3.
>>
> This is to be discussed. I prefer a separate parameters group which is
> mandatory, the trajectory or observables group may be present or not,
> independently of each other. A H5MD file may also be used as input of a
> simulation, then the trajectory group makes perfect sense while
> 'observables' should contain the outcome of the simulation. And there are
> other parameters like space dimension that would not fit well into the
> observables group.
> Konrad and Peter, what do you think?

Konrad Hinsen | 12 Oct 15:31 2011

Re: tentative of synthesis: parameters, box size, particle number

On 5 Oct, 2011, at 10:16 , Felix Höfling wrote:

> I think the above scheme naturally includes triclinic boxes. What is then
> the benefit of the attribute "kind"? If it is set to "cubic", one could
> use scalars instead of matrices for the edge value. But such an approach
> could easily result in an endless distinction of cases for H5MD readers.
> On the other hand, knowing that the box is cuboid/orthorhombic would
> simplify the computation of, e.g., the box volume. (In the present scheme,
> one needs to compute the determinant of the edges matrix—which is not hard
> if an algorithm like numpy.linalg.det is available.)

For me the important difference between a cubic and a triclinic box is that a cubic box is guaranteed to be
cubic at all times, whereas a general triclinic box may happen to be cubic at some instant and then change.
That's why it is not sufficient to say "store the general lattice vectors for a triclinic box, and let the
reader check the geometry to see if it is cubic". The reader would have to do that check for all time steps. So
yes, I consider it important to store information about "cubicity" somehow, even if the box size and shape
is then always stored fully (three lattice vectors).
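
To illustrate the cost described here, a minimal sketch of what a reader would have to do without a stored "cubicity" flag: inspect the edges matrix of every stored time step. The names `is_cubic` and `cubic_at_all_steps` are hypothetical, the edges are assumed to be 3x3 matrices whose rows are the lattice vectors, and the tolerance is an arbitrary choice.

```python
def is_cubic(edges, tol=1e-12):
    """True if the edges matrix is diagonal with three equal entries."""
    off_diagonal_zero = all(abs(edges[i][j]) <= tol
                            for i in range(3) for j in range(3) if i != j)
    lx, ly, lz = edges[0][0], edges[1][1], edges[2][2]
    return off_diagonal_zero and abs(lx - ly) <= tol and abs(ly - lz) <= tol


def cubic_at_all_steps(edges_series, tol=1e-12):
    """Check every stored step; the cost grows with the number of steps."""
    return all(is_cubic(edges, tol) for edges in edges_series)


if __name__ == "__main__":
    cube = ((5.0, 0.0, 0.0), (0.0, 5.0, 0.0), (0.0, 0.0, 5.0))
    sheared = ((5.0, 0.0, 0.0), (0.5, 5.0, 0.0), (0.0, 0.0, 5.0))
    # A box that happens to be cubic at the first step and then changes.
    print(cubic_at_all_steps([cube, sheared, cube]))
```

With a stored attribute the reader could skip this scan entirely, which is the argument made above for recording the box kind explicitly.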

> If we find it really necessary to provide additional meta information such as
> box kind or time dependence, I would prefer a Boolean scheme that allows
> these features to be combined independently (time_dependence: true/false,
> orthogonal: true/false, internal_symmetries: true/false).

Some things are naturally boolean (e.g. time dependence), others aren't (e.g. box shape). I'd prefer not
to be dogmatic about how metadata is stored.

On 6 Oct, 2011, at 11:13 , Pierre de Buyl wrote:

> Ok, I realized only recently (this week) that the set of symmetry
> transformations is not to be applied to the coordinates but that

Peter Colberg | 12 Oct 23:42 2011

Re: tentative of synthesis: parameters, box size, particle number

Dear all,

After catching up with 20(!) mails on box geometry and symmetry
transformations for periodic images, I am ready to comment. But
before that, a belated welcome to the discussion group, Konrad!

On Wed, Oct 12, 2011 at 03:31:52PM +0200, Konrad Hinsen wrote:
> On 5 Oct, 2011, at 10:16 , Felix Höfling wrote:
> 
> > I think the above scheme naturally includes triclinic boxes. What is then
> > the benefit of the attribute "kind"? If it is set to "cubic", one could
> > use scalars instead of matrices for the edge value. But such an approach
> > could easily result in an endless distinction of cases for H5MD readers.
> > On the other hand, knowing that the box is cuboid/orthorhombic would
> > simplify the computation of, e.g., the box volume. (In the present scheme,
> > one needs to compute the determinant of the edges matrix—which is not hard
> > if an algorithm like numpy.linalg.det is available.)
> 
> For me the important difference between a cubic and a triclinic box
> is that a cubic box is guaranteed to be cubic at all times, whereas
> a general triclinic box may happen to be cubic at some instant and
> then change. That's why it is not sufficient to say "store the
> general lattice vectors for a triclinic box, and let the reader
> check the geometry to see if it is cubic". The reader would have to
> do that check for all time steps. So yes, I consider it important to
> store information about "cubicity" somehow, even if the box size and
> shape is then always stored fully (three lattice vectors).

I agree that it is impractical to store lattice vectors and demand
from a reader to detect a cubic box by checking for zero components,

Peter Colberg | 12 Oct 23:48 2011

Re: tentative of synthesis: parameters, box size, particle number

On Wed, Oct 12, 2011 at 05:42:56PM -0400, Peter Colberg wrote:
> Please, it would be nice to see the follow-up discussion of details in
> the form of git commits against h5md HEAD, in the spirit of “show me
> the code” ;-).

Just to make sure: I did *not* mean git commits *pushed* to the
repository, but sent to the list for review using git format-patch
or related tools.

Peter

