Stathis Kamperis | 22 Mar 2009 17:00

summer of code - scrub feature

Greetings everyone,

I hereby express my interest in the implementation of the "scrub data
blocks before deletion" project, as part of the summer of code.
I am going to investigate the topic a bit more and write my
application in the next couple of days.

Any feedback, whatsoever, would be highly appreciated at this point.

Best regards,
Stathis Kamperis

Thor Lancelot Simon | 22 Mar 2009 21:44

Re: summer of code - scrub feature

On Sun, Mar 22, 2009 at 06:00:51PM +0200, Stathis Kamperis wrote:
> Greetings everyone,
> 
> I hereby express my interest in the implementation of the "scrub data
> blocks before deletion" project, as part of the summer of code.
> I am going to investigate the topic a bit more and write my
> application in the next couple of days.

What does this feature get us that rm -P doesn't already?

Thor

Alistair Crooks | 23 Mar 2009 00:21

Re: summer of code - scrub feature

On Sun, Mar 22, 2009 at 04:44:28PM -0400, Thor Lancelot Simon wrote:
> On Sun, Mar 22, 2009 at 06:00:51PM +0200, Stathis Kamperis wrote:
> > Greetings everyone,
> > 
> > I hereby express my interest in the implementation of the "scrub data
> > blocks before deletion" project, as part of the summer of code.
> > I am going to investigate the topic a bit more and write my
> > application in the next couple of days.
> 
> What does this feature get us that rm -P doesn't already?

Nothing. Well, nothing that a decent implementation of rm -P wouldn't
cure. And no, I don't believe the study recently that said that a
single overwrite left no traces of previous data, but I'm kinda funny
that way.

Yes, and there's the ability to have f?truncate(2) catered for by
overwriting with random gibberish.

Oh, and not having to remember to specify -P to rm for any file that
you want to make sure is overwritten, because it's too late once those
data blocks make it back onto the freelist. I suppose you could take the
hit and alias rm -P, and wait while directory trees take ages to delete.
We're all too busy anyway, these days, and could do with a break.

Then there's the fact that rm isn't the only way to issue an unlink(2)
or truncate(2) or ftruncate(2) system call - I mean, we very rarely use
rm -rf, or tar --unlink, or 'cat > file', 'cc -o file' or any other
means.
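
A minimal sketch of the overwrite-then-unlink idea under discussion, in
userland C. It makes a single pass with one fill byte, whereas rm -P
makes several. The helper names scrub_fd and scrub_and_unlink are made
up for illustration and are not part of rm or of any NetBSD API. It
only helps when the caller remembers to ask for it, which is the
limitation described above, and it does nothing about other hard links,
spared sectors, or copies of the data held elsewhere.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: overwrite every byte of an open file with `pattern`. */
int
scrub_fd(int fd, unsigned char pattern)
{
	struct stat st;
	unsigned char buf[4096];
	off_t left;

	if (fstat(fd, &st) == -1)
		return -1;
	if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
		return -1;
	memset(buf, pattern, sizeof(buf));
	for (left = st.st_size; left > 0; ) {
		size_t want = left < (off_t)sizeof(buf) ?
		    (size_t)left : sizeof(buf);
		ssize_t n = write(fd, buf, want);

		if (n <= 0)
			return -1;
		left -= n;
	}
	/* Push the overwrite to the device before the blocks are freed. */
	return fsync(fd);
}

/* Hypothetical helper: scrub a regular file, then remove its name. */
int
scrub_and_unlink(const char *path)
{
	int fd, rv;

	assert(path != NULL);
	if ((fd = open(path, O_WRONLY)) == -1)
		return -1;
	rv = scrub_fd(fd, 0xff);
	if (close(fd) == -1)
		rv = -1;
	if (rv == -1)
		return -1;	/* leave the file in place if scrubbing failed */
	return unlink(path);
}
```

Anything that calls unlink(2) or truncate(2) directly bypasses a helper
like this, which is the argument for doing the work in the file system
instead.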


Thor Lancelot Simon | 23 Mar 2009 01:07

Re: summer of code - scrub feature

On Sun, Mar 22, 2009 at 11:21:45PM +0000, Alistair Crooks wrote:
> 
> Yes, and there's the ability to have f?truncate(2) catered for by
> overwriting with random gibberish.
> 
> Oh, and not having to remember to specify -P to rm for any file that
> you want to make sure is overwritten, because it's too late once those
> data blocks make it back onto the freelist. I suppose you could take the
> hit and alias rm -P, and wait while directory trees take ages to delete.
> We're all too busy anyway, these days, and could do with a break.

There's another problem -- spared sectors.  There is a US Government
standard for erasing files and disks, which used to specify procedures
for securely erasing individual files.  That portion of the standard
was rescinded: for the government's purposes, anyway, only whole-disk
erase will suffice, and if the disk will not allow spared-out sectors
to be overwritten with the mandated erase patterns, even that is not
enough.

This is why I stopped improving rm -P after I read the _current_ 
version of the standard.  It is probably good enough, now, for most
people's purposes -- but it is _not_ good enough for the purposes of
those who care enough to do what the relevant standard says; and it
can't be.

The other problem is that the WAPBL journal will have pieces of file
data in it.  Aggressively overwriting the log after transactions have
been committed will _murder_ performance.  The best solution to all this
is to just use cgd!


Alistair Crooks | 23 Mar 2009 02:13

Re: summer of code - scrub feature

On Sun, Mar 22, 2009 at 08:07:06PM -0400, Thor Lancelot Simon wrote:
> On Sun, Mar 22, 2009 at 11:21:45PM +0000, Alistair Crooks wrote:
> > 
> > Yes, and there's the ability to have f?truncate(2) catered for by
> > overwriting with random gibberish.
> > 
> > Oh, and not having to remember to specify -P to rm for any file that
> > you want to make sure is overwritten, because it's too late once those
> > data blocks make it back onto the freelist. I suppose you could take the
> > hit and alias rm -P, and wait while directory trees take ages to delete.
> > We're all too busy anyway, these days, and could do with a break.
> 
> There's another problem -- spared sectors.  There is a US Government
> standard for erasing files and disks, which used to specify procedures
> for securely erasing individual files.  That portion of the standard
> was rescinded: for the government's purposes, anyway, only whole-disk
> erase will suffice, and if the disk will not allow spared-out sectors
> to be overwritten with the mandated erase patterns, even that is not
> enough.
> 
> This is why I stopped improving rm -P after I read the _current_ 
> version of the standard.  It is probably good enough, now, for most
> people's purposes -- but it is _not_ good enough for the purposes of
> those who care enough to do what the relevant standard says; and it
> can't be.

Yeah, sector sparing is a problem that is not meant to be addressed
by this project. It's another one in itself, and is highly dependent
on the disk itself, since it's the disk that's changing the rules
underneath the OS's feet. Which isn't to say that the project couldn't
[...]

Thor Lancelot Simon | 23 Mar 2009 02:20

Re: summer of code - scrub feature

On Mon, Mar 23, 2009 at 01:13:16AM +0000, Alistair Crooks wrote:
> On Sun, Mar 22, 2009 at 08:07:06PM -0400, Thor Lancelot Simon wrote:
>  
> > The other problem is that the WAPBL journal will have pieces of file
> > data in it.  Aggressively overwriting the log after transactions have
> > been committed will _murder_ performance.  The best solution to all this
> > is to just use cgd!
> 
> Requiring the use of cgd is certainly one way of addressing the
> problem.  Another way would be to give up using computers completely
> since we can never be sure that data will not be copied without our
> prior knowledge and sent back to the HQ of the cabal's mothership.

Let me try to state the actual performance constraints as I understand
them, more succinctly:

	1) Overwrite-on-erase for every erase will cost more than
	   encrypt-on-write for every write, on almost any modern
	   platform and for almost any workload.

	2) WAPBL makes this even more so.  Much more so.

	3) We expect that almost all users will use WAPBL for almost
	   all workloads.

This suggests to me that cgd is always a better solution to the problem
we are trying to solve here, that data might be read after files are
erased.

Moreover, cgd ensures that you don't even have to worry about the journal
[...]

Jan Schaumann | 23 Mar 2009 03:06

Re: summer of code - scrub feature

Alistair Crooks <agc <at> pkgsrc.org> wrote:

> Sounds like an excellent idea to me - I don't know if the list of
> projects has been closed or what, but I'm sure one of the SoC admins
> can help us out here.

The list of projects never closes.  However, for SoC, we do need not
only to have an idea for a project, but also somebody with the time,
ability and willingness (yes, all three) to mentor the project.

If both these components (project description + mentor) are there, just
add the project to the website (or mail www <at>  to do it for you).

-Jan

Alistair Crooks | 23 Mar 2009 03:26

Re: summer of code - scrub feature

On Sun, Mar 22, 2009 at 09:20:47PM -0400, Thor Lancelot Simon wrote:
> On Mon, Mar 23, 2009 at 01:13:16AM +0000, Alistair Crooks wrote:
> > On Sun, Mar 22, 2009 at 08:07:06PM -0400, Thor Lancelot Simon wrote:
> >  
> > > The other problem is that the WAPBL journal will have pieces of file
> > > data in it.  Aggressively overwriting the log after transactions have
> > > been committed will _murder_ performance.  The best solution to all this
> > > is to just use cgd!
> > 
> > Requiring the use of cgd is certainly one way of addressing the
> > problem.  Another way would be to give up using computers completely
> > since we can never be sure that data will not be copied without our
> > prior knowledge and sent back to the HQ of the cabal's mothership.
> 
> Let me try to state the actual performance constraints as I understand
> them, more succinctly:
> 
> 	1) Overwrite-on-erase for every erase will cost more than
> 	   encrypt-on-write for every write, on almost any modern
> 	   platform and for almost any workload.

A mount option would do it for the last unlink, and every truncate.
Per-file system and user flags are a bit finer-grained. For them,
no-one is talking about "every erase". Like I said before, if I
accidentally remove my .ssh or .gnupg directory, it's too late.
Having system and user scrub flags for the files on there means
taking the worry out of that kind of thing.
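
The policy described above (a mount option covering the last unlink and
every truncate, plus finer-grained per-file system and user flags) can
be sketched as a small decision function. This is only an illustration:
every identifier below is hypothetical, none is a real NetBSD mount or
file flag, and the sketch ignores files that are still held open.

```c
#include <assert.h>
#include <stdbool.h>

/* All names below are hypothetical; none are real NetBSD flags. */
#define X_MNT_SCRUB	0x1u	/* mount option: scrub on this file system */
#define X_UF_SCRUB	0x1u	/* per-file flag, settable by the owner */
#define X_SF_SCRUB	0x2u	/* per-file flag, settable by the superuser */

enum x_op { X_OP_UNLINK, X_OP_TRUNCATE };

/*
 * Decide whether freed data blocks should be overwritten.
 * links_left is the link count remaining after the operation.
 */
bool
x_should_scrub(unsigned mnt_flags, unsigned file_flags, enum x_op op,
    unsigned links_left)
{
	bool wanted;

	assert(op == X_OP_UNLINK || op == X_OP_TRUNCATE);
	wanted = (mnt_flags & X_MNT_SCRUB) != 0 ||
	    (file_flags & (X_UF_SCRUB | X_SF_SCRUB)) != 0;
	if (!wanted)
		return false;
	/* Truncation always frees blocks; unlink only on the last link. */
	if (op == X_OP_TRUNCATE)
		return true;
	return links_left == 0;
}
```

With flags like these, only files marked for scrubbing (say, those
under .ssh or .gnupg) would pay the overwrite cost, rather than every
erase on the volume.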

> 	2) WAPBL makes this even more so.  Much more so.

[...]

Thor Lancelot Simon | 23 Mar 2009 03:33

Re: summer of code - scrub feature

On Mon, Mar 23, 2009 at 02:26:40AM +0000, Alistair Crooks wrote:
> 
> If you're going down this route, you should also be encrypting any
> swap partitions, of course, using tempested hardware, and wearing tin
> foil on your head.  As ever, this is a question of what's possible,
> and of securing yourself as much as is economically and comfortably
> possible.

That's just silly -- and it goes nowhere to address my basic point,
which is that causing extra disk writes -- much less the painstakingly
flushed multiple overwrites that, for example, rm -P does -- today, is
much, much more expensive than just encrypting the entire volume and
being done with it.

I think it's a bad idea to waste effort on zeroizing erased data when
the same effort could be spent making it easier to do the _cheaper_ 
operation of just encrypting the data in the first place.  Jibes about
tinfoil hats are unhelpful, but make them if you like; I am done wasting
my time being spat on for talking common sense to the sky while it's
raining.

Thor

Christos Zoulas | 23 Mar 2009 03:57

Re: summer of code - scrub feature

In article <20090323012047.GA11291 <at> panix.com>,
Thor Lancelot Simon  <tls <at> rek.tjls.com> wrote:
>On Mon, Mar 23, 2009 at 01:13:16AM +0000, Alistair Crooks wrote:
>> On Sun, Mar 22, 2009 at 08:07:06PM -0400, Thor Lancelot Simon wrote:
>>  
>> > The other problem is that the WAPBL journal will have pieces of file
>> > data in it.  Aggressively overwriting the log after transactions have
>> > been committed will _murder_ performance.  The best solution to all this
>> > is to just use cgd!
>> 
>> Requiring the use of cgd is certainly one way of addressing the
>> problem.  Another way would be to give up using computers completely
>> since we can never be sure that data will not be copied without our
>> prior knowledge and sent back to the HQ of the cabal's mothership.
>
>Let me try to state the actual performance constraints as I understand
>them, more succinctly:
>
>	1) Overwrite-on-erase for every erase will cost more than
>	   encrypt-on-write for every write, on almost any modern
>	   platform and for almost any workload.
>
>	2) WAPBL makes this even more so.  Much more so.
>
>	3) We expect that almost all users will use WAPBL for almost
>	   all workloads.
>
>This suggests to me that cgd is always a better solution to the problem
>we are trying to solve here, that data might be read after files are
>erased.