
MoM of Today's OCP SONiC Call 08/20/2019

Aviz Devteam <developers@...>
 

MoM of today's OCP SONiC call  8/20/2019.

Topics discussed
  • MC-LAG HLD - Nephos 
Review (Q & A):
  • Can MC-LAG be supported on sub-port interfaces?
  • Update the scope of L2/L3 MC-LAG in the HLD.
  • Can MC-LAG support multicast?
  • Do you have scale numbers w.r.t. FDB/ARP/route sync between MC-LAG failures?
  • How can we isolate packet flooding between MC-LAG and non-MC-LAG interfaces in the same broadcast domain?
  • Update the HLD with test cases for MC-LAG failover (link/node level) scenarios.

Thanks,
AvizNetworks


MoM of today's OCP SONiC call  8/13/2019.

Topics discussed
  • SONiC management framework - BRCM & DELL
Review (Q & A)
  • Can the click CLI co-exist with the mgmt-framework? Yes.
  • Does the mgmt-framework support existing click CLI commands? Yes, click-based CLI commands will be migrated to the klish-based CLI.
  • Can the click-based CLI be deprecated? No.
  • Can the mgmt-framework support external AAA servers for authentication? Please add details to the HLD.
  • Add the AAA auth failure workflow to the REST SET workflow?
  • Does the mgmt-framework handle end-to-end error handling or a feedback loop? No, out of scope.
  • Why are you pulling the telemetry container into the mgmt container? We don't run multiple gNMI servers in SONiC, and we are requesting the community to rename the sonic-telemetry server and make it part of the mgmt-framework.
  • Will the output of the click-based CLI be changed?
  • Does the mgmt-framework support the notion of a startup config?
  • Does the mgmt-framework support CLI show commands that reflect the configDB?
  • Can the mgmt-framework support show running config?
  • Feature timelines - the scope is proposing the mgmt-framework, and there will be separate feature HLDs coming.


MoM of today's OCP SONiC call  8/6/2019.

Topics discussed
  • Sub port interface design - Winda
Review (Q & A)
  • How is a sub-port interface different from a VLAN interface in SONiC? Ans: A VLAN interface is a bridge port in SONiC.
  • Rename the dot1Q table? Since there is a VLAN interface table, a dot1Q interface table is a little confusing; the community suggested going with a sub-port/interface table.
  • How about a separate sub-interface/port manager for sub-port interfaces?
  • Does the sub-port feature use sonic-cli or direct native calls? It uses Linux iproute2 calls.
  • Do you expect iproute2 upgrades to be needed to support the sub-port feature? No.
  • What is the use case of MTU with a sub-interface?
  • Can sub-port interfaces be supported on port-breakout interfaces?
  • Do you see any issues with the naming convention w.r.t. port breakouts and sub-ports?
  • Is there any limit on sub-port interfaces? Yes, refer to the scalability section [750 per switch].
  • A few questions on sub-port functionality: if a packet enters untagged, how is it routed to a sub-port interface?
  • What is the miss-policy support with sub-port interfaces? Could be dropped - debatable.
  • Define the behavior for untagged and miss-policy packets arriving at a physical port. How does SONiC process these packets?
  • Can physical and sub-port interfaces share the same neighbor table, or do they use different ones?
  • Add a section to the HLD for cross-functional/port properties when the port is a layer 3/layer 2 port?
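
The minutes note that the sub-port feature relies on Linux iproute2 calls. As a rough illustration only, the sketch below assembles the kind of `ip` command lines that create an 802.1Q sub-interface on a parent port. The helper name, the `Ethernet0.10` naming scheme, and the exact command sequence are assumptions for illustration and are not taken from the HLD; the commands are built as argument lists and never executed.

```python
from typing import List


def subport_commands(parent: str, vlan_id: int, ip_prefix: str) -> List[List[str]]:
    """Build iproute2 argv lists for one 802.1Q sub-interface (not executed)."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"VLAN ID out of range: {vlan_id}")
    name = f"{parent}.{vlan_id}"  # hypothetical naming convention
    return [
        # Create the VLAN sub-interface on top of the parent port.
        ["ip", "link", "add", "link", parent, "name", name,
         "type", "vlan", "id", str(vlan_id)],
        # Assign the L3 address to the sub-interface.
        ["ip", "address", "add", ip_prefix, "dev", name],
        # Bring the sub-interface up.
        ["ip", "link", "set", "dev", name, "up"],
    ]


cmds = subport_commands("Ethernet0", 10, "192.0.2.1/24")
for argv in cmds:
    print(" ".join(argv))
```

Keeping the commands as argument lists leaves the choice of how to run them (for example via `subprocess.run`) to the caller.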


Announcements:
  • 201908 release - will be delayed to 10/2019
  • Please send out PRs to the SONiC mailing lists
  • OCP Amsterdam [Europe] - end of Sept.

MoM of today's OCP SONiC call  07/23/2019.

Topics discussed
  • Debug framework design spec - BRCM
Review (Q & A)
  • What is the impact on the current show tech dump?
  • Can the framework support getting the tech dump for a specified time slice/range?
  • Does the framework support any schema for debug event triggers?
  • Where does this framework run, and can the user turn it off?
  • Will the framework export debug data in JSON format?

MoM of today's OCP SONiC call  07/16/2019.

Topics discussed
  • Egress Mirror support and ACL action capability check 
Review (Q & A)
  • Is this feature backward compatible? Yes [SONiC-to-SONiC]
  • Is there any requirement for egress mirroring to have all packet modifications done in the mirrored copy? No such support.
  • What is the behavior if the maximum number of egress sessions is programmed? - Not a requirement
  • If both ingress and egress mirroring are enabled on the same packet, do we see two mirror copies? Yes, this might need a fix.
  • Does SONiC have any limit on supported egress mirror sessions? - Depends on the ASIC limit
  • Does this design support truncating the mirrored copy? Is it in the SONiC/SAI spec? Need to check.

  • SONiC Image Build Time Improvements (MLNX)
Review (Q & A)

  • Does the design use parallel builds? Yes, it makes use of all the CPU threads (12).
  • How much build time improvement can we see if we discount the kernel? - ~1 h (we build Linux in a separate thread)
  • How is Docker BuildKit (DBK) different from native Docker? - DBK is written entirely for Docker images and supports isolated users instead of multiple users.

    Announcements
    • 201908 release tracking
    • Repurposing the sub-group meetings to design meetings.

    MoM of today's OCP SONiC call  07/09/2019.

    Topics discussed

    • PDE (Platform Development Environment) / PDDF (Platform Driver Development Framework) - BRCM
    Review (Q&A)
    • Is PDE specific to the BRCM chipset? Not necessarily; whoever supports SAI can use it.
    • What interfaces does PDE provide for ASIC and platform? The PDDF data-driven framework (JSON APIs) and existing driver APIs.
    • Can the framework allow vendor extensions? PDDF supports vendor extensions.
    • How is PDE packaged? PDE can be built along with the full SONiC image and dockers, or as an individual docker.
    • Can custom plugins (e.g. BMC) integrate with PDE? Yes.
    • Can we load PDE onto multiple targets? Possible.

    Announcements
    • PR review ownership - check out the 201908 release tracking page

    MoM of today's OCP SONiC call  06/25/2019.

    Topics discussed

    • VRF design discussion  - Nephos (Jeffrey) 
    Review (Q&A)
    • How is VRF configured in the Linux kernel? As of now, though there is a CLI wrapper, SONiC ultimately uses Linux Netlink calls. [Community has some suggestions - Liat may help here with our examples]
    • Questions on the config_db migration script for VRF config migration? Offline discussions will continue, along with PR feedback.
    • What is the design decision behind creating an empty interface entry INTERFACE|Ethernet0:{} in config_db? Multiple things: 1) SAI, 2) code complexity behind the resource migration, etc. There is a section in the PR where feedback can be provided.
    • There is a request to add the VRF ID besides the interface name in the next hop? The decision seems to be that we are going with minimal configuration to support the SONiC system design.
    • Can we safely assume the VRF design supports Linux kernel versions later than 4.9? Yes.
    What next?
    • PR discussion could be extended to the next meeting based on the PR feedback. [Jeffrey/Prince]
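
To make the empty-entry question above more concrete, here is a small sketch of how an INTERFACE table with both a bare per-interface entry and IP-keyed entries might be checked for consistency. The table contents, the `vrf_name` field, the `interface|prefix` key layout, and the helper name are assumptions for illustration and are not copied from the PR; the actual reasons for the design are the ones recorded in the PR section mentioned above.

```python
from typing import Dict, List

# Hypothetical snapshot of the config_db INTERFACE table; field names and
# values are illustrative assumptions, not copied from the PR.
INTERFACE_TABLE: Dict[str, Dict[str, str]] = {
    "Ethernet0": {},                      # empty per-interface entry
    "Ethernet0|192.0.2.1/24": {},         # IP entry keyed by interface|prefix
    "Ethernet4": {"vrf_name": "Vrf1"},    # interface entry carrying a VRF binding
    "Ethernet4|198.51.100.1/24": {},
    "Ethernet8|203.0.113.1/24": {},       # IP entry with no per-interface entry
}


def orphan_ip_entries(table: Dict[str, Dict[str, str]]) -> List[str]:
    """Return IP-keyed entries whose interface has no entry of its own."""
    parents = {key for key in table if "|" not in key}
    return sorted(
        key for key in table
        if "|" in key and key.split("|", 1)[0] not in parents
    )


orphans = orphan_ip_entries(INTERFACE_TABLE)
print(orphans)
```

A check like this could be one step a migration script performs before adding the missing per-interface entries.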

    MoM of today's OCP SONiC call  06/18/2019.

    Topics discussed

    • Error Handling  - BRCM (Santhosh)
    Review (Q&A)
    We had a great discussion with a lot of input from the community; here is some of it. Feel free to add any missing comments.
    • How does the framework support multiple CRUD failures?
    [Ben]: See below 
    • Do you provide a knob to switch off the error handling feature? Is a knob necessary?
    [Ben]: No knob is necessary. The error handling proposal is a framework that is available for a) implementation of error reporting in SWSS on a feature-by-feature basis and b) application processing of such errors. Both a) and b) are implementation choices that can be made on a feature-by-feature basis. And if an application does not want to process a supported error, then it can just ignore it. 
    • Do applications get out-of-order notifications from the feedback loop? How should that case be handled? Ex: the user does create/delete/create; do you expect the error feedback to come in order?
    [Ben]: The specific comment was that the key/values used to refer to APP_DB (or other) in an ERROR_DB report may not be specific enough to distinguish between different error events. The example given (by Nikos) was a route add-withdraw-add case - since the APP_DB table entry may be the same between the 2 adds, then, if there's an error report, how does the application (FRR in this case) know which of the adds failed? We will come back on this point. 
    • What is the design decision behind a new Error DB? Why can't we merge error attributes into APP DB? 
    [Ben]: We thought about both options, and decided that the ERROR_DB gave a bit more flexibility and avoided changing existing application tables. It was not a clear decision, but we see no reason to move away from it. 
    • What is the mechanism to synchronize route CRUD between APP DB and the new Error DB?
    [Ben]: See above 
    • Is the new Error DB a mirror of APP DB?
    [Ben]: Not really - but each error table entry points to a corresponding entry in another table (usually APP_DB) 
    • The current design mentions an approach to stop propagating failed/error routes to the neighbors? This may not be right per the RFC; the routes should propagate even though they failed due to some policy. (Nikos)
    [Ben]: This topic went beyond the scope of the framework (#1 above) and into the BGP doc (#2). We will set up a separate offline discussion for this.

    Overall feedback - The feedback loop is necessary to address SAI fatal errors. However, the community requested that the design dissociate/decouple the feedback loop as much as possible so that applications have the freedom to react to/handle it in their own way.
    [Ben]: That's exactly how it's set up today. 
    One option suggested - the framework should be more generic and should accommodate opaque error context for the applications.
    [Ben]: This is a different topic - see above ("The specific comment was that the key/values ....")

    Xin will extend an offline discussion on this topic, stay tuned.
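
The add-withdraw-add concern raised by Nikos can be illustrated with a small sketch. It models three operations on the same route key and shows that an error report carrying only the key matches both adds, while a report that also echoes an opaque per-operation context matches exactly one. The key format, the sequence number used as context, and the function names are assumptions for illustration; they are not part of the error handling proposal.

```python
from typing import List, Tuple

# Each operation is (sequence number, op, APP_DB-style key). The sequence
# number stands in for a hypothetical opaque context echoed in error reports.
ops: List[Tuple[int, str, str]] = [
    (1, "add", "ROUTE_TABLE:10.1.0.0/16"),
    (2, "withdraw", "ROUTE_TABLE:10.1.0.0/16"),
    (3, "add", "ROUTE_TABLE:10.1.0.0/16"),
]


def candidates_by_key(error_key: str) -> List[int]:
    """Add operations an error report could refer to if it carries only the key."""
    return [seq for seq, op, key in ops if key == error_key and op == "add"]


def candidates_by_context(error_key: str, context: int) -> List[int]:
    """Operations matched when the report also echoes the opaque context."""
    return [seq for seq, op, key in ops if key == error_key and seq == context]


ambiguous = candidates_by_key("ROUTE_TABLE:10.1.0.0/16")
exact = candidates_by_context("ROUTE_TABLE:10.1.0.0/16", 3)
print(ambiguous, exact)
```

With only the key, the application cannot tell which of the two adds failed; the extra context removes that ambiguity at the cost of carrying more state per request.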


    Announcements 
    • SONiC Release 201908 tracking page - Xin, can you post the link?
    • Action item for the community - sign up for PR reviews

    MoM of today's OCP SONiC call  06/04/2019.

    Topics discussed
    • STP/PVST - Sandeep (BRCM)
    Q & A 
    • Can this STP feature be disabled at compile time? BRCM will explore this (compile-time/run-time options to disable/enable the STP/PVST feature).
    • Warm reboot is not supported for PVST? The community requested that more details be added to the design.
    • Multiple questions on the design decision for why STP states are not programmed into the kernel: 1) With the current STP design, the STP states are not populated in the kernel, so the ASIC and kernel will be out of sync; what is the downside? 2) Say a port/VLAN is not blocking in the kernel but is blocked in the ASIC; what is the behavior of ARP/ping/OSPF in this scenario? BRCM should document the scenarios.
    • The community requested documentation of the ASIC and kernel out-of-sync scenarios - AI BRCM
    • There should be no drop if HW says forwarding? Yes.
    • Is there a mechanism to program the states into the kernel? BRCM to explore it.
    • If a trap is configured on a port which is blocked, does the packet come to the CPU? Yes, based on the trap configuration.
    • When a port is blocked in HW, which packets should be sent? - HW shouldn't block L2 packets/LACP exchanges but should drop L3 packets.
    • Can CoPP be programmed to trap to CPU? Yes.

    • HLD on NAT  - Kiran Kella (BRCM)

    Q & A 
    • Does it support payload/embedded headers (ALGs - application-level gateways)? Not right now.
    • Discussion will continue in the next sub-group meeting.
    Announcements 
    • Next sub-group meetings: HLD on NAT, sFlow

    MoM of today's OCP SONiC SUB GROUP call  05/28/2019.

    Topics discussed:
    • Status on MLAG Design discussions - Nephos Team

    Q & A 
    • Does this solution address L3 MLAG alone? Both L3 and L2. It seems the L2 MLAG HLD needs some updates.
    • Does MCLAG support multicast? The Nephos team will update the HLD with all the use cases and missing pieces.
    • When is the next meeting to discuss MCLAG? June 11th.
    • The community requested an updated MCLAG HLD from the Nephos team before June 11th.

    Action Items/Announcements
    • Will it be possible to discuss topics other than MCLAG in sub-group calls? Yes. Xin, we will work on it and adjust the cadence.
    • The community requested that user scenarios be included/updated in HLDs for review.
    • Ben Gale (BRCM) will present a proposal on MCLAG in the next few weeks.
    • The community is requested to review the MCLAG PR below before the next sub-group meeting (06/11/2019).
    • Here are the PR and design presentation:
      1.  MCLAG video - https://www.youtube.com/watch?v=shFEKjBp66Q&feature=youtu.be
      2.  MCLAG PR - https://github.com/Azure/SONiC/pull/325

    MoM of today's OCP SONiC call  05/21/2019.

    Topics discussed
    • L2 - FDB/MAC enhancements - Anil (Broadcom)

    Q & A 
    • Is FDB aging per device? Yes.
    • Does FDB aging support per-second granularity? Yes.
    • Can MAC aging be supported per port and VLAN? Anil will add support to the proposal.
    • What is the design for restricting warning logs in the VLAN range feature? Broadcom will consider this in the proposal [aggregated log, etc.].
    • Does this feature need SAI support from vendors? No new SAI attributes; Broadcom will list the SAI APIs this feature currently uses.
    • How are VLAN range updates implemented? The VLAN range is consolidated in config_db and applied down to the hardware in a single shot, with no deletes and adds.
    • Do we have an FDB type in the FDB entry? Yes [static vs dynamic], and it will be displayed in show commands.
    • How are FDB flushes optimized on topology/STP events? Outside of the ASIC; in the case of Broadcom, flushes are quick.
    • How does a system-wide FDB flush work? It should be handled by SAI by going over all the ports and VLANs; vendor specific.
    • Community ask for MAC aging and MAC move scale numbers? Broadcom will add them to the proposal.
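
The VLAN range answer above describes consolidating the range before applying it in a single shot. A minimal sketch of what such consolidation could look like, assuming VLAN IDs arrive as an unordered list that may contain duplicates; the function name and the (start, end) tuple representation are assumptions for illustration, not the proposal's actual data model.

```python
from typing import Iterable, List, Tuple


def consolidate_vlans(vlan_ids: Iterable[int]) -> List[Tuple[int, int]]:
    """Merge VLAN IDs into sorted, inclusive (start, end) ranges."""
    ranges: List[Tuple[int, int]] = []
    for vid in sorted(set(vlan_ids)):
        if ranges and vid == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], vid)   # extend the current range
        else:
            ranges.append((vid, vid))           # start a new range
    return ranges


merged = consolidate_vlans([10, 11, 12, 20, 21, 30, 11])
print(merged)
```

Applying the merged ranges in one pass, rather than issuing a delete and an add per VLAN, is the behavior the answer describes.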

    • BFD - Sumit Agarwal (Broadcom)
    Q & A 

    • Discussed BFD implementation Phase 1 and Phase 2.
    • In BFD Phase 1: BFD is part of the BGP docker.
    • In BFD Phase 2: BFD will be implemented in hardware.
    • Can SONiC users turn it off if they don't want it? Yes, at compile time, but the community suggested not running it by default and providing a CLI to enable it.
    • How does BFD work with warm reboots? 1) For a planned warm reboot, users can update the BFD timers up front. 2) For an unplanned warm reboot, the BFD session will time out before BGP times out.
    • Can BFD timeouts be configured/controlled on remote BGP peers? Question from Nikos. Needs more discussion.
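
The warm reboot answer above depends on how the BFD timeout compares with the BGP hold timer. As a rough sketch, BFD detection time in asynchronous mode is the negotiated interval multiplied by the detect multiplier (RFC 5880); the helper names and the sample values below are assumptions for illustration and are not SONiC or FRR defaults.

```python
def bfd_detection_time_ms(rx_interval_ms: int, detect_multiplier: int) -> int:
    """Detection time in ms: negotiated interval times the detect multiplier."""
    if rx_interval_ms <= 0 or detect_multiplier <= 0:
        raise ValueError("interval and multiplier must be positive")
    return rx_interval_ms * detect_multiplier


def bfd_fires_before_bgp(rx_interval_ms: int, detect_multiplier: int,
                         bgp_hold_time_s: int) -> bool:
    """True if BFD would declare the session down before the BGP hold timer."""
    detect_ms = bfd_detection_time_ms(rx_interval_ms, detect_multiplier)
    return detect_ms < bgp_hold_time_s * 1000


# Illustrative values only.
detect_ms = bfd_detection_time_ms(300, 3)
print(detect_ms, bfd_fires_before_bgp(300, 3, 180))
```

For a planned warm reboot, raising the interval or multiplier up front lengthens the detection time so the session can survive the restart window.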
    Announcements 

    • More design reviews are lined up for Aug 2019.
    • Provide feedback on PRs.
    • Watch out for the bi-weekly meeting on design proposals and reviews.

    MoM of today's OCP SONiC call  05/07/2019.

    Topics discussed
    • SONiC 201908 release Planning - 05/07/2019

    Q & A 
    • Need code review support for multi-db performance improvements - MSFT & AVIZ Networks
    • What is the scope of the error handling mechanism work by BRCM? It covers SAI error surfacing and handling.
    • What is the scope of configuration validation? Open for design; the current scope is to use the syslog mechanism to propagate config errors.
    • What is the VRF feature planned in SONiC? It is VRF-lite support, not MPLS.
    • Do we have a plan for multi-tenancy VPN with the VRF feature? No, that will be handled separately.
    • When is the VRF-lite design review? Expected 5/21.
    • What is the ETA for dynamic breakout? Xin will work with LNKD.
    • For dynamic breakout, is it possible to get an ASIC vendor ETA? Xin will talk to the ASIC vendors [an ETA of early July would help to test it].
    • Do we have a list of platform APIs? Refer to the PMON APIs.
    • How can companies earn OCP credits? Check the OCP website for how to earn credits, such as through software contributions.
    • Is the sub-port feature the same as sub-interface? Yes.
    • What kind of features run on a sub-port? No HLD yet; Jipan will come back with an HLD on this.
    • Can we have a short description of sub-port? Xin will work with Alibaba.
    • When is the SAI proposal on sFlow? Dell is working on the SAI proposal for sFlow and will send it for design review.
    • What does the SONiC side use for sFlow? HSflowD; it's an open-source package, and the licensing needs to be checked [need to explore the licensing part, work with Xin].
    • Build improvements - experimental, BRCM? A design review is needed on the changes. Ben will provide a design review.
    • What is the mgmt framework? The goal is to easily manage the SONiC switch [models, serialization, unified CLI, gNMI].
    • What is BFD for FRR used for? For BGP failures.
    • Does BFD-FRR require SAI support? No; the current work does not use any SAI BFD APIs, but the next iteration will.
    • Is an official SONiC release supported on ONL? No, SONiC has a tight roadmap for the next 8 months.

    Announcements 
    • OCP events - www.opencompute.org/events/upcoming events - road shows in Taiwan, Beijing, India
    • SONiC next meeting 05/21/2019
    • The SONiC team will use the alternate Tuesdays for workgroup meetings [Test workgroups, MLAG/L2 workgroups, etc.]
    APR release
    • Redis performance - out of the APR release
    • CLI improvement - moved to the next release
    • Any ETA for APR release stabilization? Needs to be estimated.