Ramki_Krishnan | 27 Jul 15:51 2015

measuring energy consumption for NFV

One of the topics which came up during the Prague meeting while discussing Al’s draft on benchmarking VNFs was measuring energy consumption. Please find details below. More details are also in the NFVRG draft - https://datatracker.ietf.org/doc/draft-krishnan-nfvrg-policy-based-rm-nfviaas/?include_text=1.

 

At the physical server level, instantaneous energy consumption can be
accurately measured through the IPMI standard. At a VM level,
instantaneous energy consumption can be approximately measured using an
overall utilization metric, which is a combination of CPU utilization,
memory usage, I/O usage, and network usage.
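To make the VM-level approximation concrete, here is a minimal Python sketch (not from the draft; the metric weights and the idle/dynamic power split are illustrative assumptions) of combining per-resource utilizations into a single metric and apportioning an IPMI-measured server power reading across VMs:

```python
# Illustrative sketch: the weights and idle_fraction are assumptions,
# not standardized values; server_power_w stands in for an IPMI reading.

def overall_utilization(cpu, mem, io, net, weights=(0.6, 0.2, 0.1, 0.1)):
    """Combine per-resource utilizations (each 0.0-1.0) into one metric."""
    return sum(u * w for u, w in zip((cpu, mem, io, net), weights))

def vm_energy_estimate(server_power_w, vm_util, all_vm_utils, idle_fraction=0.5):
    """Apportion measured server power across VMs: idle power is split
    evenly; dynamic power is shared in proportion to utilization."""
    n = len(all_vm_utils)
    idle = server_power_w * idle_fraction / n
    total_util = sum(all_vm_utils) or 1.0
    dynamic = server_power_w * (1 - idle_fraction) * vm_util / total_util
    return idle + dynamic

# Example: a 300 W server (IPMI reading) hosting two VMs
u1 = overall_utilization(0.8, 0.5, 0.2, 0.4)   # 0.64
u2 = overall_utilization(0.2, 0.3, 0.1, 0.1)   # 0.20
per_vm_watts = [vm_energy_estimate(300, u, [u1, u2]) for u in (u1, u2)]
```

Note that the two per-VM estimates sum to the measured server power; any real model would need per-platform calibration of the weights.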

Thanks,

Ramki

 

_______________________________________________
bmwg mailing list
bmwg <at> ietf.org
https://www.ietf.org/mailman/listinfo/bmwg
MORTON, ALFRED C (AL) | 27 Jul 14:30 2015

Re: draft-ietf-bmwg-virtual-net / CPU & memory utilization should be test configurations or test results?

Hi Saurabh,

 

Thanks for your question and for continuing this discussion.

 

In my mind, any server-oriented measurement that informs us
about resources consumed would be a fair result to collect.
But as we said during the meeting, only as an auxiliary metric
while other benchmarking is in progress.
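(As a hedged illustration of what an auxiliary metric collected while other benchmarking is in progress might look like: a background sampler that records utilization during a trial. The read_util callable is a placeholder for whatever counter source a lab actually uses; nothing here is from the draft.)

```python
import statistics
import threading
import time

class AuxSampler:
    """Collect auxiliary utilization samples while a benchmark trial runs."""

    def __init__(self, read_util, interval=0.01):
        self.read_util = read_util        # placeholder: returns (cpu%, mem%)
        self.interval = interval
        self.samples = []
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while True:
            self.samples.append(self.read_util())
            if self._stop.wait(self.interval):   # True once stop is set
                break

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()

    def summary(self):
        cpu = [c for c, _ in self.samples]
        mem = [m for _, m in self.samples]
        return {"cpu_mean": statistics.mean(cpu), "cpu_max": max(cpu),
                "mem_mean": statistics.mean(mem), "mem_max": max(mem)}

# Report the summary alongside the primary benchmark results
with AuxSampler(lambda: (42.0, 63.0)) as sampler:   # stubbed reader
    time.sleep(0.05)                                # trial would run here
aux_report = sampler.summary()
```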

 

One of the challenges I mentioned adding to my draft is for
benchmarking metrics to assist deployment designers, or perhaps
to provide input for some form of resource model, so that your
question about adding VNFs below might be answered.

 

We have a draft that begins to address the resource-sharing
aspect of benchmarking here:
https://tools.ietf.org/html/draft-vsperf-bmwg-vswitch-opnfv-00
prepared by the OPNFV vsperf project, and Maryam Tahhan presented
additional material on this topic in slides (see the IETF-93 materials).

 

Sorry for the brief response and the delay in responding; I’ve
been travelling all weekend and just arrived at another meeting.

 

regards,

Al

 

From: Saurabh Chattopadhyay - ERS, HCL Tech [mailto:saurabhchattopadhya <at> hcl.com]
Sent: Friday, July 24, 2015 5:58 PM
To: draft-ietf-bmwg-virtual-net <at> tools.ietf.org
Cc: bmwg <at> ietf.org
Subject: draft-ietf-bmwg-virtual-net / CPU & memory utilization should be test configurations or test results?

 

Hello Al,

 

First, thank you for writing this draft. It is an excellent guideline for those of us working in the VNF benchmarking area.

 

In the BMWG meeting session yesterday, there were some interesting discussions around whether to consider CPU / memory utilization (and similar parameters) as test configurations or as test results. I think we couldn’t discuss this in detail due to time constraints, so I thought of bringing it up on the list to get your and other experts’ views.

 

My own understanding is that, for VNF benchmarking specifically, this issue becomes a little delicate. Because a black-box VNF relies on soft integration and soft partitioning of the underlying hardware, all benchmark results depend on the load imposed on the entire hardware, in addition to the load directed towards the VNF specifically. For example, if a VNF is pinned to four cores and the hardware provides 12 more cores, the VNF’s response under a fixed load condition will vary as the other cores are put under different load conditions. At this point we could consider creating certain fixed load profiles (say, a combination of compute, storage, and networking load percentages) over the remaining hardware, and benchmark the VNF under test against its own load conditions as originally planned. However, in real deployments these fixed load profiles don’t correlate well unless we qualify every VNF’s performance profile against these parameters. So even though benchmark data is produced for the VNF on a particular hardware configuration and under certain load conditions on the remaining hardware, deployment teams are not clear on how to leverage this intelligence, especially when planning to deploy other VNFs on the remaining hardware.

 

I’m not sure what an appropriate way to solve this would be. Measuring CPU / memory utilization (and similar parameters for the shared assets) may be an option, but I’m not sure this truly aligns with black-box benchmarking methodologies. Kindly advise.

 

Warm Regards,

Saurabh

 



::DISCLAIMER::
----------------------------------------------------------------------------------------------------------------------------------------------------

The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only.
E-mail transmission is not guaranteed to be secure or error-free as information could be intercepted, corrupted,
lost, destroyed, arrive late or incomplete, or may contain viruses in transmission. The e mail and its contents
(with or without referred errors) shall therefore not attach any liability on the originator or HCL or its affiliates.
Views or opinions, if any, presented in this email are solely those of the author and may not necessarily reflect the
views or opinions of HCL or its affiliates. Any form of reproduction, dissemination, copying, disclosure, modification,
distribution and / or publication of this message without the prior written consent of authorized representative of
HCL is strictly prohibited. If you have received this email in error please delete it and notify the sender immediately.
Before opening any email and/or attachments, please check them for viruses and other defects.

----------------------------------------------------------------------------------------------------------------------------------------------------

GEORGESCU LIVIU MARIUS | 25 Jul 12:31 2015

Re: Question about using the IPv6 benchmarking address space

The amended prefix is within the specifications of RFC5180. But again, I don't know if it's really necessary to change the prefix. I am not sure, but my guess is that IANA allocated the prefix to prevent any benchmarking traffic from reaching the Internet. That being said, I don't see any problem with using 2001:2:a:bb1e::/64 in an isolated test environment.
However, if respecting the recommendations of RFC5180 is desired, you should also consider the following note in RFC5180:
" Note: Similar to RFC 2544 avoiding the use of RFC 1918 address space for benchmarking tests, this document does not recommend the use of RFC 4193 [4] (Unique Local Addresses) in order to minimize the possibility of conflicts with operational traffic."
This is relevant for the discussion in v6ops about using ULA, which is not recommended by RFC5180.
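(The prefix containment being discussed can be double-checked with the Python standard library; a quick sketch, not part of the original message. `subnet_of` needs Python 3.7+.)

```python
import ipaddress

bench = ipaddress.ip_network("2001:2::/48")           # RFC 5180 (+errata)
proposed = ipaddress.ip_network("2001:2:a:bb1e::/64")
amended = ipaddress.ip_network("2001:2:0:aab1::/64")

print(proposed.subnet_of(bench))   # False: outside the /48
print(amended.subnet_of(bench))    # True: within the /48
```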
Marius
On 07/25/15, David Schinazi <dschinazi <at> apple.com> wrote:

Thanks for clarifying Marius, I must have misread /32 instead of /48.

In that case, let me amend my question to use a prefix like this instead:
2001:2:0:aab1/64
(named after our NASDAQ symbol AAPL)

Thanks,
David


On Jul 25, 2015, at 11:25, GEORGESCU LIVIU MARIUS <liviumarius-g <at> is.naist.jp> wrote:

Just to clarify, the IPv6 prefix mentioned in RFC5180(+errata) is 2001:2::/48 and in my understanding does not include the prefix 2001:2:a:bb1e::/64. However, I don't think there's any issue with using it in the context of an isolated testing environment.

Marius

On 07/24/15, David Schinazi <dschinazi <at> apple.com> wrote:
Hi bmwg,

We would like to use addresses from the IPv6 benchmarking prefix on an independent test network and wonder if there are issues
we have not thought of.

A little background on what we're doing: in OS X El Capitan we introduced a NAT64 mode for the Mac's Internet Sharing feature
to allow iOS developers to test their applications for IPv6 support. Using this, you can share your IPv4 internet connectivity
(e.g. from ethernet) to a newly created Wi-Fi network that only supports IPv6, and the Mac will perform NAT64+DNS64.
Currently the internal addresses of that network are using the Teredo prefix (2001::/64) and we have been advised to use
something that is not treated differently by RFC 6724 (Default Address Selection for IPv6).
We thought that since this is a testing network and that those addresses never leave that link, using the benchmarking prefix
from RFC5180(+ errata) (2001:2/32) would be reasonable. We thought that using 2001:2:a:bb1e/64 as our network prefix
would be appropriate, as "a:bb1e" looks a bit like "apple". Note that this is the prefix advertised by RA to the Wi-Fi network,
it is not the prefix of the NAT64 translation.

Does anyone think there could be any issues with this?

Thanks,
David Schinazi
Apple CoreOS Networking Engineer


GEORGESCU LIVIU MARIUS | 25 Jul 11:25 2015

Re: Question about using the IPv6 benchmarking address space

Just to clarify, the IPv6 prefix mentioned in RFC5180(+errata) is 2001:2::/48 and in my understanding does not include the prefix 2001:2:a:bb1e::/64. However, I don't think there's any issue with using it in the context of an isolated testing environment.


Marius

On 07/24/15, David Schinazi <dschinazi <at> apple.com> wrote:
Hi bmwg,

We would like to use addresses from the IPv6 benchmarking prefix on an independent test network and wonder if there are issues
we have not thought of.

A little background on what we're doing: in OS X El Capitan we introduced a NAT64 mode for the Mac's Internet Sharing feature
to allow iOS developers to test their applications for IPv6 support. Using this, you can share your IPv4 internet connectivity
(e.g. from ethernet) to a newly created Wi-Fi network that only supports IPv6, and the Mac will perform NAT64+DNS64.
Currently the internal addresses of that network are using the Teredo prefix (2001::/64) and we have been advised to use
something that is not treated differently by RFC 6724 (Default Address Selection for IPv6).
We thought that since this is a testing network and that those addresses never leave that link, using the benchmarking prefix
from RFC5180(+ errata) (2001:2/32) would be reasonable. We thought that using 2001:2:a:bb1e/64 as our network prefix
would be appropriate, as "a:bb1e" looks a bit like "apple". Note that this is the prefix advertised by RA to the Wi-Fi network,
it is not the prefix of the NAT64 translation.

Does anyone think there could be any issues with this?

Thanks,
David Schinazi
Apple CoreOS Networking Engineer



David Schinazi | 24 Jul 22:59 2015

Question about using the IPv6 benchmarking address space

Hi bmwg,

We would like to use addresses from the IPv6 benchmarking prefix on an independent test network and wonder
if there are issues we have not thought of.

A little background on what we're doing: in OS X El Capitan we introduced a NAT64 mode for the Mac's Internet
Sharing feature to allow iOS developers to test their applications for IPv6 support. Using this, you can share
your IPv4 internet connectivity (e.g. from ethernet) to a newly created Wi-Fi network that only supports IPv6,
and the Mac will perform NAT64+DNS64.
Currently the internal addresses of that network are using the Teredo prefix (2001::/64) and we have been
advised to use something that is not treated differently by RFC 6724 (Default Address Selection for IPv6).
We thought that since this is a testing network and that those addresses never leave that link, using the
benchmarking prefix from RFC5180(+ errata) (2001:2/32) would be reasonable. We thought that using
2001:2:a:bb1e/64 as our network prefix would be appropriate, as "a:bb1e" looks a bit like "apple". Note that
this is the prefix advertised by RA to the Wi-Fi network, it is not the prefix of the NAT64 translation.

Does anyone think there could be any issues with this?

Thanks,
David Schinazi
Apple CoreOS Networking Engineer
Bradner, Scott | 23 Jul 14:16 2015

draft-ietf-bmwg-bgp-basic-convergence-05

This ID came up during the BMWG session today.

Al mentioned that it had been in the RFC Editor queue with a reference issue for a long time.

The missing reference is ietf-sidr-bgpsec-protocol; it is only referenced in section 4.8:

4.8.  Authentication


   Authentication in BGP is done using the TCP MD5 Signature Option
   [RFC5925].  The processing of the MD5 hash, particularly in devices
   with a large number of BGP peers and a large amount of update
   traffic, can have an impact on the control plane of the device.  If
   authentication is enabled, it MUST be documented correctly in the
   reporting format.

   Also it is recommended that trials MUST be with the same SIDR
   features (RFC7115 & BGPSec).  The best convergence tests would be
   with No SIDR features, and then with the same SIDR features.

Two things about this reference:

1/ The reference is listed as normative, but it does not seem to me to be so in the overall context of the ID;

moving it to informational would seem to be fine.

2/ The title of RFC 5925 is "The TCP Authentication Option" (the text uses the title of the obsoleted RFC).

Scott


IETF Secretariat | 23 Jul 13:13 2015

Milestones changed for bmwg WG

Changed milestone "Basic BGP Convergence Benchmarking Methodology to
IESG Review", resolved as "Done".

Changed milestone "Draft on Traffic Management Benchmarking to IESG
Review", resolved as "Done".

Changed milestone "Draft on In-Service Software Upgrade Benchmarking
to IESG Review", resolved as "Done".

URL: https://datatracker.ietf.org/wg/bmwg/charter/
MORTON, ALFRED C (AL) | 23 Jul 01:29 2015

Slides and remote participation

All the slides we've received have been posted,
please send the missing decks ASAP.

Remote participation details are here, find us on
ch8, Karlin III.
http://www.ietf.org/meeting/93/remote-participation.html#audio

regards,
Al
bmwg co-chair
Bhuvan (Veryx Technologies) | 21 Jul 13:22 2015

Benchmarking Methodology for SDN Controller Performance - Updated Draft Version

Dear BMWG Members,

 

We have updated the draft about SDN Controller benchmarking (draft-bhuvan-bmwg-sdn-controller-benchmark-meth-00), addressing comments received at the IETF-92 meeting. Thank you very much for providing your valuable comments. The latest version is available as draft-bhuvan-bmwg-sdn-controller-benchmark-meth-01.

 

Summary of Changes:

a. Updated the test setup diagram following the comment from Scott Bradner.

b. Added recommendations for test topology, test iterations, etc., to use for benchmarking.

c. Provided reference test topologies.

d. Split the Path Provisioning tests into two different tests: Proactive and Reactive Path Provisioning.

e. Provided more clarity on the test procedure for some of the tests.

f. Fixed IETF normative language usage.

 

We would love to hear any comments and queries on the same.

 

Thanks,

Authors

GEORGESCU LIVIU MARIUS | 18 Jul 16:40 2015

Re: I-D Update

Hello Al,


If it's just spelling, it's satisfying enough in my opinion. 
I noticed you are using it in other contexts as well (e.g. VNF draft) and was wondering if it was supposed to have a different meaning than "scalability".
In the spirit of a draft-a-week, here are my comments for the draft https://www.ietf.org/id/draft-ietf-bmwg-virtual-net-00.txt.
Please find attached a txt version. I marked my comments with ###MG.

Best regards,
Marius

On 07/10/15, "MORTON, ALFRED C (AL)" <acmorton <at> att.com> wrote:
Hi Marius,
I typed "scaleability" late last night, and spell check didn't object, so I kept going...
simple explanation, but not satisfying,
Al
________________________________________

From: GEORGESCU LIVIU MARIUS [liviumarius-g <at> is.naist.jp]
Sent: Thursday, July 09, 2015 12:24 PM
To: MORTON, ALFRED C (AL); bmwg <at> ietf.org
Subject: RE: [bmwg] I-D Update

Dear Al,

Thank you very much for the very detailed review. All the comments are relevant and I will attempt to answer them in the next version.
One question, though: why do you use the spelling "scaleability"?

Best regards,
Marius

On 07/09/15, "MORTON, ALFRED C (AL)" <acmorton <at> att.com> wrote:
Hi Marius,

Attached, please find a few suggestions for your draft.
search for ACM:

Thanks for seeking and implementing the many comments so far!
regards,
Al
(as a participant)

From: bmwg [mailto:bmwg-bounces <at> ietf.org] On Behalf Of Marius Georgescu
Sent: Thursday, July 02, 2015 6:37 AM
To: bmwg <at> ietf.org
Subject: [bmwg] I-D Update

Dear BMWG Members,

I hope my e-mail finds you well. Please find below a link to the latest update of the draft I’ve been working on:
https://tools.ietf.org/html/draft-georgescu-bmwg-ipv6-tran-tech-benchmarking-01

The biggest change and probably the most debatable is the addition of the Delay variation (jitter) subsection under Benchmarking tests (Section 6.3), following the comments received during the IETF92 meeting.  I would like to take this opportunity to thank all the people that supported and provided feedback for this draft so far and to ask for more ☺.
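(For context on the new jitter subsection: delay variation is commonly reported in one of two forms, inter-packet delay variation and packet delay variation, cf. RFC 5481. A small sketch with invented one-way delay values, not taken from the draft:)

```python
def ipdv(delays):
    """Inter-packet delay variation: difference between consecutive delays."""
    return [b - a for a, b in zip(delays, delays[1:])]

def pdv(delays):
    """Packet delay variation: each delay relative to the minimum delay."""
    dmin = min(delays)
    return [d - dmin for d in delays]

one_way_delays_ms = [10.0, 12.5, 11.0, 15.0]   # invented sample
print(ipdv(one_way_delays_ms))   # [2.5, -1.5, 4.0]
print(pdv(one_way_delays_ms))    # [0.0, 2.5, 1.0, 5.0]
```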

Best regards,
Marius







Network Working Group                                          A. Morton
Internet-Draft                                                 AT&T Labs
Intended status: Informational                              May 31, 2015
Expires: December 2, 2015


  Considerations for Benchmarking Virtual Network Functions and Their
                             Infrastructure
                     draft-ietf-bmwg-virtual-net-00

Abstract

   Benchmarking Methodology Working Group has traditionally conducted
   laboratory characterization of dedicated physical implementations of
   internetworking functions.  This memo investigates additional
   considerations when network functions are virtualized and performed
   in commodity off-the-shelf hardware.

   Version NOTES:

   Addressed Barry Constantine's comments throughout the draft, see:

   http://www.ietf.org/mail-archive/web/bmwg/current/msg03167.html


   AND, comments from the extended discussion during IETF-92 BMWG
   session:

   1 & 2: General Purpose HW and why we care to a greater degree about
   "what's in the black box" in this benchmarking context.

   3: System under Test description = platform and VNFs and...

   4.1 Scale and capacity benchmarks still needed.

   4.4 Compromise on appearance of capacity and the 3x3 Matrix

   new 4.5, Power consumption

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].








Morton                  Expires December 2, 2015                [Page 1]

Internet-Draft     Benchmarking VNFs and Related Inf.           May 2015


Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.


   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 2, 2015.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
   2.  Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . .   4
   3.  Considerations for Hardware and Testing . . . . . . . . . . .   5
     3.1.  Hardware Components . . . . . . . . . . . . . . . . . . .   5
     3.2.  Configuration Parameters  . . . . . . . . . . . . . . . .   5
     3.3.  Testing Strategies  . . . . . . . . . . . . . . . . . . .   6
     3.4.  Attention to Shared Resources . . . . . . . . . . . . . .   7
   4.  Benchmarking Considerations . . . . . . . . . . . . . . . . .   7
     4.1.  Comparison with Physical Network Functions  . . . . . . .   7
     4.2.  Continued Emphasis on Black-Box Benchmarks  . . . . . . .   8
     4.3.  New Benchmarks and Related Metrics  . . . . . . . . . . .   8
     4.4.  Assessment of Benchmark Coverage  . . . . . . . . . . . .   9
     4.5.  Power Consumption . . . . . . . . . . . . . . . . . . . .  11
   5.  Security Considerations . . . . . . . . . . . . . . . . . . .  12





   6.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  12
   7.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  12
   8.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  12
     8.1.  Normative References  . . . . . . . . . . . . . . . . . .  12
     8.2.  Informative References  . . . . . . . . . . . . . . . . .  14
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .  14

1.  Introduction

   Benchmarking Methodology Working Group (BMWG) has traditionally
   conducted laboratory characterization of dedicated physical
   implementations of internetworking functions (or physical network
   functions, PNFs).  The Black-box Benchmarks of Throughput, Latency,
   Forwarding Rates and others have served our industry for many years.
   [RFC1242] and [RFC2544] are the cornerstones of the work.

   An emerging set of service provider and vendor development goals is
   to reduce costs while increasing flexibility of network devices, and
   drastically accelerate their deployment.  Network Function
   Virtualization (NFV) has the promise to achieve these goals, and
   therefore has garnered much attention.  It now seems certain that
   some network functions will be virtualized following the success of
   cloud computing and virtual desktops supported by sufficient network
   path capacity, performance, and widespread deployment; many of the
   same techniques will help achieve NFV.

   In the context of Virtualized Network Functions (VNF), the supporting
   Infrastructure requires general-purpose computing systems, storage
   systems, networking systems, virtualization support systems (such as
   hypervisors), and management systems for the virtual and physical
   resources.  There will be many potential suppliers of Infrastructure
   systems and significant flexibility in configuring the systems for
   best performance.  There are also many potential suppliers of VNFs,
   adding to the combinations possible in this environment.  The
   separation of hardware and software suppliers has a profound
   implication on benchmarking activities: much more of the internal
   configuration of the black-box device under test (DUT) must now be
   specified and reported with the results, to foster both repeatability
   and comparison testing at a later time.

   Consider the following User Story as further background and
   motivation:

   "I'm designing and building my NFV Infrastructure platform.  The
   first steps were easy because I had a small number of categories of
   VNFs to support and the VNF vendor gave HW recommendations that I
   followed.  Now I need to deploy more VNFs from new vendors, and there
   are different hardware recommendations.  How well will the new VNFs





   perform on my existing hardware?  Which among several new VNFs in a
   given category are most efficient in terms of capacity they deliver?
   And, when I operate multiple categories of VNFs (and PNFs)
   *concurrently* on a hardware platform such that they share resources,
   what are the new performance limits, and what are the software design
   choices I can make to optimize my chosen hardware platform?
   Conversely, what hardware platform upgrades should I pursue to
   increase the capacity of these concurrently operating VNFs?"

   See http://www.etsi.org/technologies-clusters/technologies/nfv for
   more background; for example, the white papers there may be a useful
   starting place.  The Performance and Portability Best Practices
   [NFV.PER001] are particularly relevant to BMWG.  There are documents
   available in the Open Area
   http://docbox.etsi.org/ISG/NFV/Open/Latest_Drafts/ including drafts
   describing Infrastructure aspects and service quality.

2.  Scope

   BMWG will consider the new topic of Virtual Network Functions and
   related Infrastructure to ensure that common issues are recognized
   from the start, using background materials from industry and SDOs
   (e.g., IETF, ETSI NFV).

   This memo investigates additional methodological considerations
   necessary when benchmarking VNFs instantiated and hosted in general-
   purpose hardware, using bare-metal hypervisors or other isolation
   environments such as Linux containers.  An essential consideration is
   benchmarking physical and virtual network functions in the same way
   when possible, thereby allowing direct comparison.  Also,
   benchmarking combinations of physical and virtual devices and
   functions in a System Under Test.

   A clearly related goal: the benchmarks for the capacity of a general-
   purpose platform to host a plurality of VNF instances should be
   investigated.  Existing networking technology benchmarks will also be
   considered for adaptation to NFV and closely associated technologies.

   A non-goal is any overlap with traditional computer benchmark
   development and their specific metrics (SPECmark suites such as
   SPECCPU).

   A colossal non-goal is any form of architecture development related
   to NFV and associated technologies in BMWG, consistent with all
   chartered work since BMWG began in 1989.








3.  Considerations for Hardware and Testing

   This section lists the new considerations which must be addressed to
   benchmark VNF(s) and their supporting infrastructure.  The System
   Under Test (SUT) is composed of the hardware platform components, the
   VNFs installed, and many other supporting systems.  It is critical to
   document all aspects of the SUT to foster repeatability.

3.1.  Hardware Components

   New Hardware devices will become part of the test set-up.

   1.  High volume server platforms (general-purpose, possibly with
       virtual technology enhancements).

   2.  Storage systems with large capacity, high speed, and high
       reliability.

   3.  Network Interface ports specially designed for efficient service
       of many virtual NICs.

   4.  High capacity Ethernet Switches.

   Labs conducting comparisons of different VNFs may be able to use the
   same hardware platform over many studies, until the steady march of
   innovations overtakes their capabilities (as happens with the lab's
   traffic generation and testing devices today).

3.2.  Configuration Parameters

   It will be necessary to configure and document the settings for the
   entire general-purpose platform to ensure repeatability and foster
   future comparisons, including:

   o  number of server blades (shelf occupation)

   o  CPUs

   o  caches

   o  storage system

   o  I/O

   as well as configuration of the components which host the VNF
   itself:

   o  Hypervisor (or other forms of virtual function hosting)
   
   



   o  Virtual Machine (VM)

   o  Infrastructure Virtual Network (which interconnects Virtual
      Machines with physical network interfaces, or with each other
      through virtual switches, for example)

   and finally, the VNF itself, with items such as:

   o  specific function being implemented in VNF

   o  reserved resources for each function (e.g., CPU pinning)

   o  number of VNFs (or sub-VNF components, each with its own VM) in
      the service function chain (see section 1.1 of [RFC7498] for a
      definition of service function chain)

   o  number of physical interfaces and links transited in the service
      function chain

   ###MG: The draft could benefit from a Terminology Section/Subsection.  	  
	  
   In the physical device benchmarking context, most of the
   corresponding infrastructure configuration choices were determined by
   the vendor.  Although the platform itself is now one of the
   configuration variables, it is important to maintain emphasis on the
   networking benchmarks and capture the platform variables as input
   factors.
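
   To foster repeatability, the configuration items above could be
   captured in a machine-readable record that accompanies each benchmark
   report.  The sketch below shows one illustrative way to do so; all
   field names and values are assumptions for illustration, not part of
   this memo:

```python
from dataclasses import dataclass, asdict

@dataclass
class SUTConfig:
    """Illustrative record of a System Under Test configuration."""
    server_blades: int
    cpu_model: str
    cache_l3_mb: int
    hypervisor: str
    vm_count: int
    cpu_pinning: dict   # VNF name -> list of pinned cores
    chain_length: int   # VNFs in the service function chain

# Example values only; real reports would record the actual platform.
cfg = SUTConfig(
    server_blades=2,
    cpu_model="example-cpu",
    cache_l3_mb=20,
    hypervisor="example-hypervisor",
    vm_count=6,
    cpu_pinning={"VNF1": [1], "VNF2": [2, 3, 4, 5]},
    chain_length=3,
)

# Serialize alongside benchmark results so future comparisons can
# confirm that an identical platform configuration was used.
record = asdict(cfg)
```

   Storing the record with the results makes the platform variables
   explicit input factors, as the paragraph above recommends.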

3.3.  Testing Strategies

   The concept of characterizing performance at capacity limits may
   change.  For example:

   1.  It may be more representative of system capacity to characterize
       the case where the Virtual Machines (VMs) hosting the VNFs are
       operating at 50% utilization, and therefore sharing the "real"
       processing power across many VMs.

   2.  Another important case stems from the need for partitioning
       functions.  A noisy neighbor (VM hosting a VNF in an infinite
       loop) would ideally be isolated and the performance of other VMs
       would continue according to their specifications.

   3.  System errors will likely occur as transients, implying a
       distribution of performance characteristics with a long tail
       (like latency), leading to the need for longer-term tests of each
       set of configuration and test parameters.

   4.  The desire for elasticity and flexibility among network functions
       will include tests where there is constant flux in the number of
       VM instances.  Requests for and instantiation of new VMs, along
       with releases of VMs hosting VNFs that are no longer needed,
       would be a normal operational condition.  In other words,
       benchmarking should include scenarios with production life cycle
       management of VMs and their VNFs and network connectivity in-
       progress, as well as static configurations.

   5.  All physical things can fail, and benchmarking efforts can also
       examine recovery aided by the virtual architecture with different
       approaches to resiliency.
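
   Point 3 above implies summarizing long-tailed distributions rather
   than relying on averages alone.  A minimal sketch, using synthetic
   latency samples with rare transient spikes (all values are
   illustrative):

```python
import random
import statistics

random.seed(7)

# Synthetic per-packet latencies (ms): mostly fast, plus rare
# transient spikes -- the long-tailed shape described in point 3.
latencies = [random.gauss(0.5, 0.05) for _ in range(10_000)]
latencies += [random.uniform(5.0, 50.0) for _ in range(20)]

def percentile(samples, p):
    """Nearest-rank percentile for 0 < p <= 100."""
    ordered = sorted(samples)
    k = max(0, int(round(p / 100 * len(ordered))) - 1)
    return ordered[k]

mean = statistics.mean(latencies)
p50 = percentile(latencies, 50)
p999 = percentile(latencies, 99.9)
```

   The mean and median barely register the transients, while the 99.9th
   percentile exposes them; only a sufficiently long run collects enough
   samples for such tail estimates to be stable, hence the need for
   longer-term tests.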

3.4.  Attention to Shared Resources

   Since many components of the new NFV Infrastructure are virtual, test
   set-up design must have prior knowledge of the interactions and
   dependencies within the various resource domains in the System Under
   Test (SUT).  For example, a virtual machine performing the role of a
   traditional tester function, such as generating and/or receiving
   traffic, should avoid sharing any SUT resources with the Device Under
   Test (DUT).  Otherwise, the results will have unexpected dependencies
   not encountered in physical device benchmarking.

   Note: The term "tester" has traditionally referred to devices
   dedicated to testing in BMWG literature.  In this new context,
   "tester" additionally refers to functions dedicated to testing, which
   may be either virtual or physical.  "Tester" has never referred to
   the individuals performing the tests.

   The shared-resource aspect of test design remains one of the critical
   challenges to overcome in a reasonable way to produce useful results.
   The physical test device remains a solid foundation to compare
   against results using combinations of physical and virtual test
   functions, or results using only virtual testers when necessary to
   assess virtual interfaces and other virtual functions.

4.  Benchmarking Considerations

   This section discusses considerations related to Benchmarks
   applicable to VNFs and their associated technologies.

4.1.  Comparison with Physical Network Functions

   In order to compare the performance of VNFs and system
   implementations with their physical counterparts, identical
   benchmarks must be used.  Since BMWG has already developed
   specifications for many network functions, there will be re-use of
   existing benchmarks through references, while allowing for the
   possibility of benchmark curation during development of new
   methodologies.  Consideration should be given to quantifying the
   number of parallel VNFs required to achieve comparable scale/capacity
   with a given physical device, or whether some limit of scale was
   reached before the VNFs could achieve the comparable level.  Again,
   implementations based on different hypervisors or other forms of
   virtual function hosting remain critical factors in performance
   assessment.

4.2.  Continued Emphasis on Black-Box Benchmarks

   When the network functions under test are based on Open Source code,
   there may be a tendency to rely on internal measurements to some
   extent, especially when the externally-observable phenomena only
   support an inference of internal events (such as routing protocol
   convergence observed in the dataplane).  Examples include CPU/Core
   utilization and memory committed/used.  However, external
   observations remain essential as the basis for Benchmarks.  Internal
   observations with fixed specification and interpretation may be
   provided in parallel, to assist the development of operations
   procedures when the technology is deployed, for example.  Internal
   metrics and measurements from Open Source implementations may be the
   only direct source of performance results in a desired dimension, but
   corroborating external observations are still required to assure that
   the integrity of measurement discipline was maintained for all
   reported results.

   ###MG: Maybe RECOMMENDED(SHOULD) and  MAY should be used here for Black-Box vs White-Box (Grey-Box) benchmarks.
 
   A related aspect of benchmark development is where the scope includes
   multiple approaches to a common function under the same benchmark.
   For example, there are many ways to arrange for activation of a
   network path between interface points and the activation times can be
   compared if the start-to-stop activation interval has a generic and
   unambiguous definition.  Thus, generic benchmark definitions are
   preferred over technology/protocol specific definitions where
   possible.

4.3.  New Benchmarks and Related Metrics

   There will be new classes of benchmarks needed for network design and
   assistance when developing operational practices (possibly automated
   management and orchestration of deployment scale).  Examples follow
   in the paragraphs below, many of which are prompted by the goals of
   increased elasticity and flexibility of the network functions, along
   with accelerated deployment times.

   Time to deploy VNFs: In cases where the general-purpose hardware is
   already deployed and ready for service, it is valuable to know the
   response time when a management system is tasked with "standing-up"
   hundreds of virtual machines and the VNFs they will host.
   
   ###MG: I guess this is a generic classification of possible new metrics for VNFs. 
   Maybe a substructure (taxonomy/bullet list/Subsections) may help better organize the categories. 
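
   A sketch of how the "Time to deploy VNFs" benchmark above might be
   driven is shown below.  The stand_up_vm function is a placeholder
   assumption for a real orchestrator call, here simulated with a short
   sleep:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def stand_up_vm(index):
    """Placeholder for an orchestrator call that boots one VM and
    confirms its VNF is ready; a short sleep simulates the work."""
    time.sleep(0.01)
    return index

def time_to_deploy(n_vms, parallelism=32):
    """Wall-clock time for n_vms stand-up requests to complete,
    plus the count of unsuccessful stand-ups."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        succeeded = list(pool.map(stand_up_vm, range(n_vms)))
    elapsed = time.monotonic() - start
    failures = n_vms - len(succeeded)
    return elapsed, failures

elapsed, failures = time_to_deploy(100)
```

   The elapsed time is a Speed result, while the failure count supports
   a Reliability result such as the percentage of unsuccessful VM/VNF
   stand-ups.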

   Time to migrate VNFs: In cases where a rack or shelf of hardware must
   be removed from active service, it is valuable to know the response
   time when a management system is tasked with "migrating" some number
   of virtual machines and the VNFs they currently host to alternate
   hardware that will remain in-service.

   Time to create a virtual network in the general-purpose
   infrastructure: This is a somewhat simplified version of existing
   benchmarks for convergence time, in that the process is initiated by
   a request from (centralized or distributed) control, rather than
   inferred from network events (link failure).  The successful response
   time would remain dependent on dataplane observations to confirm that
   the network is ready to perform.

   Also, it appears to be valuable to measure traditional packet
   transfer performance metrics during the assessment of traditional and
   new benchmarks, including metrics that may be used to support service
   engineering such as the Spatial Composition metrics found in
   [RFC6049].  Examples include Mean one-way delay in section 4.1 of
   [RFC6049], Packet Delay Variation (PDV) in [RFC5481], and Packet
   Reordering [RFC4737] [RFC4689].
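
   As an illustration of such auxiliary metrics, the mean one-way delay
   (in the style of [RFC6049]) and PDV in the [RFC5481] sense (each
   packet's delay relative to the stream minimum) can be summarized from
   per-packet timestamps.  The data below are synthetic, for
   illustration only:

```python
import statistics

# Synthetic (send, receive) timestamps in seconds for one test stream;
# delay cycles deterministically to keep the example self-contained.
send_times = [i * 0.001 for i in range(1000)]
recv_times = [s + 0.0005 + (i % 10) * 0.00001
              for i, s in enumerate(send_times)]

delays = [r - s for s, r in zip(send_times, recv_times)]

mean_delay = statistics.mean(delays)   # RFC 6049-style mean delay
d_min = min(delays)
pdv = [d - d_min for d in delays]      # RFC 5481 PDV form

# A high percentile of PDV is a common summary statistic.
pdv_p99 = sorted(pdv)[int(0.99 * len(pdv)) - 1]
```

   Collecting these alongside the new benchmarks allows the traditional
   packet transfer performance to be reported for the same runs.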

4.4.  Assessment of Benchmark Coverage

   It can be useful to organize benchmarks according to their applicable
   life cycle stage and the performance criteria they intend to assess.
   The table below provides a way to organize benchmarks such that there
   is a clear indication of coverage for the intersection of life cycle
   stages and performance criteria.

   |----------------------------------------------------------|
   |               |             |            |               |
   |               |   SPEED     |  ACCURACY  |  RELIABILITY  |
   |               |             |            |               |
   |----------------------------------------------------------|
   |               |             |            |               |
   |  Activation   |             |            |               |
   |               |             |            |               |
   |----------------------------------------------------------|
   |               |             |            |               |
   |  Operation    |             |            |               |
   |               |             |            |               |
   |----------------------------------------------------------|
   |               |             |            |               |
   | De-activation |             |            |               |
   |               |             |            |               |
   |----------------------------------------------------------|

   For example, the "Time to deploy VNFs" benchmark described above
   would be placed in the intersection of Activation and Speed, making
   it clear that there are other potential performance criteria to
   benchmark, such as the "percentage of unsuccessful VM/VNF stand-ups"
   in a set of 100 attempts.  This example emphasizes that the
   Activation and De-activation life cycle stages are key areas for NFV
   and related infrastructure, and encourages expansion beyond
   traditional benchmarks for normal operation.  Thus, reviewing the
   benchmark coverage using this table (sometimes called the 3x3 matrix)
   can be a worthwhile exercise in BMWG.
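
   One illustrative way to track coverage is to treat the matrix as a
   lookup keyed by (life cycle stage, criterion).  The benchmark
   placements below are the examples from this section, not a normative
   taxonomy:

```python
stages = ["Activation", "Operation", "De-activation"]
criteria = ["Speed", "Accuracy", "Reliability"]

# Benchmarks placed at their (stage, criterion) intersection.
coverage = {
    ("Activation", "Speed"): ["Time to deploy VNFs"],
    ("Activation", "Reliability"): ["% unsuccessful VM/VNF stand-ups"],
}

def gaps(cov):
    """Cells of the 3x3 matrix with no benchmark yet."""
    return [(s, c) for s in stages for c in criteria
            if (s, c) not in cov]
```

   Listing the uncovered cells makes the remaining benchmark development
   work explicit during a coverage review.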

   In one of the first applications of the 3x3 matrix in BMWG, we
   discovered that metrics on measured size, capacity, or scale do not
   easily match one of the three columns above.  Following discussion,
   this was resolved in two ways:

   o  Add a column, Scalability, for use when categorizing benchmarks.

   o  If using the matrix to report results in an organized way, keep
      size, capacity, and scale metrics separate from the 3x3 matrix and
      incorporate them in the report with other qualifications of the
      results.
	###MG: I imagine it's hard to optimize the Synthetic/Detailed Trade-off for the Benchmark coverage matrix.
	However, I think Scalability is a very important aspect for VNFs and would go for the "Add a column" option.
	  
   This approach encourages use of the 3x3 matrix to organize reports of
   results, where the capacity at which the various metrics were
   measured could be included in the title of the matrix (and results
   for multiple capacities would result in separate 3x3 matrices, if
   there were sufficient measurements/results to organize in that way).

   For example, results for each VM and VNF could appear in the 3x3
   matrix, organized to illustrate resource occupation (CPU Cores) in a
   particular physical computing system, as shown below.

###MG: I think a specific example on how to use the matrix could be useful.

                 VNF#1
             .-----------.
             |__|__|__|__|
   Core 1    |__|__|__|__|
             |__|__|__|__|
             |  |  |  |  |
             '-----------'
                 VNF#2
             .-----------.
             |__|__|__|__|
   Cores 2-5 |__|__|__|__|
             |__|__|__|__|
             |  |  |  |  |
             '-----------'
                 VNF#3             VNF#4             VNF#5
             .-----------.    .-----------.     .-----------.
             |__|__|__|__|    |__|__|__|__|     |__|__|__|__|
   Core 6    |__|__|__|__|    |__|__|__|__|     |__|__|__|__|
             |__|__|__|__|    |__|__|__|__|     |__|__|__|__|
             |  |  |  |  |    |  |  |  |  |     |  |  |  |  |
             '-----------'    '-----------'     '-----------'
                  VNF#6
             .-----------.
             |__|__|__|__|
   Core 7    |__|__|__|__|
             |__|__|__|__|
             |  |  |  |  |
             '-----------'

   The combination of tables above could be built incrementally,
   beginning with VNF#1 and one Core, then adding VNFs according to
   their supporting core assignments.  X-Y plots of critical benchmarks
   would also provide insight to the effect of increased HW utilization.
   All VNFs might be of the same type, or to match a production
   environment there could be VNFs of multiple types and categories.  In
   this figure, VNFs #3-#5 are assumed to require small CPU resources,
   while VNF#2 requires 4 cores to perform its function.

4.5.  Power Consumption

   Although work to benchmark physical network function power
   consumption in a meaningful way remains incomplete, the desire to
   measure the physical infrastructure supporting the virtual functions
   only adds to the need.  Both maximum power consumption and dynamic
   power consumption (under varying load) would be useful.
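
   As a sketch, instantaneous and peak power figures could be scraped
   from a platform management interface such as IPMI.  The sample text
   below mimics typical `ipmitool dcmi power reading` output; its exact
   format is an assumption to be verified against the actual BMC:

```python
import re

# Sample text in the style of `ipmitool dcmi power reading` output;
# the exact wording and layout are assumptions -- verify against
# the BMC in use before relying on this parser.
sample = """\
    Instantaneous power reading:                   212 Watts
    Minimum during sampling period:                198 Watts
    Maximum during sampling period:                260 Watts
    Average power reading over sample period:      215 Watts
"""

def parse_watts(text, label):
    """Extract the integer Watts value following a labeled line."""
    m = re.search(rf"{re.escape(label)}:\s+(\d+)\s+Watts", text)
    return int(m.group(1)) if m else None

instantaneous = parse_watts(sample, "Instantaneous power reading")
peak = parse_watts(sample, "Maximum during sampling period")
```

   A harness could invoke such a command periodically while a benchmark
   runs, yielding both the maximum and the load-varying (dynamic) power
   consumption figures discussed above.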

   >>> ADD REC from Dallas meeting...

5.  Security Considerations

   Benchmarking activities as described in this memo are limited to
   technology characterization of a Device Under Test/System Under Test
   (DUT/SUT) using controlled stimuli in a laboratory environment, with
   dedicated address space and the constraints specified in the sections
   above.

   The benchmarking network topology will be an independent test setup
   and MUST NOT be connected to devices that may forward the test
   traffic into a production network, or misroute traffic to the test
   management network.

   Further, benchmarking is performed on a "black-box" basis, relying
   solely on measurements observable external to the DUT/SUT.

   Special capabilities SHOULD NOT exist in the DUT/SUT specifically for
   benchmarking purposes.  Any implications for network security arising
   from the DUT/SUT SHOULD be identical in the lab and in production
   networks.

6.  IANA Considerations

   No IANA Action is requested at this time.

7.  Acknowledgements

   The author acknowledges an encouraging conversation on this topic
   with Mukhtiar Shaikh and Ramki Krishnan in November 2013.  Bhavani
   Parise and Ilya Varlashkin have provided useful suggestions to expand
   these considerations.  Bhuvaneswaran Vengainathan has already tried
   the 3x3 matrix with SDN controller draft, and contributed to many
   discussions.  Scott Bradner quickly pointed out shared resource
   dependencies in an early vSwitch measurement proposal, and the topic
   was included here as a key consideration.  Further development was
   encouraged by Barry Constantine's comments following the IETF-92 BMWG
   session: the session itself was an affirmation for this memo with
   many interested inputs from Scott, Ramki, Barry, Bhuvan, Jacob Rapp,
   and others.

8.  References

8.1.  Normative References

   [NFV.PER001]
              "Network Function Virtualization: Performance and
              Portability Best Practices", Group Specification ETSI GS
              NFV-PER 001 V1.1.1 (2014-06), June 2014.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2330]  Paxson, V., Almes, G., Mahdavi, J., and M. Mathis,
              "Framework for IP Performance Metrics", RFC 2330, May
              1998.

   [RFC2544]  Bradner, S. and J. McQuaid, "Benchmarking Methodology for
              Network Interconnect Devices", RFC 2544, March 1999.

   [RFC2679]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Delay Metric for IPPM", RFC 2679, September 1999.

   [RFC2680]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Packet Loss Metric for IPPM", RFC 2680, September 1999.

   [RFC2681]  Almes, G., Kalidindi, S., and M. Zekauskas, "A Round-trip
              Delay Metric for IPPM", RFC 2681, September 1999.

   [RFC3393]  Demichelis, C. and P. Chimento, "IP Packet Delay Variation
              Metric for IP Performance Metrics (IPPM)", RFC 3393,
              November 2002.

   [RFC3432]  Raisanen, V., Grotefeld, G., and A. Morton, "Network
              performance measurement with periodic streams", RFC 3432,
              November 2002.

   [RFC4689]  Poretsky, S., Perser, J., Erramilli, S., and S. Khurana,
              "Terminology for Benchmarking Network-layer Traffic
              Control Mechanisms", RFC 4689, October 2006.

   [RFC4737]  Morton, A., Ciavattone, L., Ramachandran, G., Shalunov,
              S., and J. Perser, "Packet Reordering Metrics", RFC 4737,
              November 2006.

   [RFC5357]  Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and J.
              Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)",
              RFC 5357, October 2008.

   [RFC5905]  Mills, D., Martin, J., Burbank, J., and W. Kasch, "Network
              Time Protocol Version 4: Protocol and Algorithms
              Specification", RFC 5905, June 2010.

   [RFC7498]  Quinn, P. and T. Nadeau, "Problem Statement for Service
              Function Chaining", RFC 7498, April 2015.

8.2.  Informative References

   [RFC1242]  Bradner, S., "Benchmarking terminology for network
              interconnection devices", RFC 1242, July 1991.

   [RFC5481]  Morton, A. and B. Claise, "Packet Delay Variation
              Applicability Statement", RFC 5481, March 2009.

   [RFC6049]  Morton, A. and E. Stephan, "Spatial Composition of
              Metrics", RFC 6049, January 2011.

   [RFC6248]  Morton, A., "RFC 4148 and the IP Performance Metrics
              (IPPM) Registry of Metrics Are Obsolete", RFC 6248, April
              2011.

   [RFC6390]  Clark, A. and B. Claise, "Guidelines for Considering New
              Performance Metric Development", BCP 170, RFC 6390,
              October 2011.

Author's Address

   Al Morton
   AT&T Labs
   200 Laurel Avenue South
   Middletown, NJ  07748
   USA

   Phone: +1 732 420 1571
   Fax:   +1 732 368 1192
   Email: acmorton <at> att.com
   URI:   http://home.comcast.net/~acmacm/