Maybe News - Issues

SP #1

2023-11-24T20:50:00.000Z

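This is the second special issue of Maybe News, inspired by a series of talks I watched recently. If you want to revisit the first special issue, you can find it here.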

Paul Graham: Before the Startup

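“Before the Startup” is a talk that Paul Graham, founder of the well-known Silicon Valley investment firm Y Combinator (YC) and author of Hackers & Painters, gave at Stanford University in 2014. It was in fact one lecture in the “How to Start a Startup” course series, and the person who launched that series is none other than Sam Altman, the OpenAI co-founder who has been in the spotlight lately. Besides the video, you can also read a text version on Paul Graham's personal website, but I recommend going through both (the talk itself is only 30 minutes long). The video has the most original content: in addition to the talk itself, it includes a fair amount of interaction between Graham and the audience, some “jokes” that would not suit the text version, and an 18-minute Q&A at the end. The text version, in turn, adds footnotes that Graham wrote afterwards, plus links to some of his earlier essays (which work as further reading).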

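Unlike the other lectures in “How to Start a Startup”, Graham's talk uses no slides at all; he delivers the full 30 minutes by speaking alone (and barely glances at his notes). For someone like me who had only read Graham's writing before, watching the video made him even more charismatic in my eyes. Within the series, his topic is actually one of the more abstract ones. It doesn't offer many ready-to-use startup “best practices” or “clever tricks”, only the “experience” Graham has distilled from years of venture investing, and the central idea can be summed up in one sentence: startups are very counterintuitive.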

Around this central idea, Graham lists six counterintuitive points:

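If you follow your instincts, they will lead you astray;
Knowing a lot about startups is not that important;
Gaming the system does not work as a way to succeed in startups;
A startup demands your full commitment;
Whether a startup will succeed is hard to predict;
The way to get startup ideas is not to try to think of startup ideas.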

Graham explains each of these points in detail with examples, so I won't restate them one by one here.

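After the 30-minute talk comes an 18-minute Q&A session. Since the text version on Graham's website does not include the Q&A, I've written up a few of the questions I found interesting: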

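If I'm interested in starting a company, is business school still worth it?
- Basically no. Business school is meant to teach people how to manage, and management is a problem you only run into once a startup is successful enough. To make a startup succeed, what you need to learn early on is how to build a good product.
- One thing I got wrong early on was advising people who wanted to start a startup to work at another company for a few years first and then start their own. But honestly, the best way to learn how to start a startup is to try starting one.
Your talk says management only becomes a problem after you succeed, but won't there be management problems with the first two or three people who join?
- A startup's first employees are almost like founders. They should be motivated by the same things; they can't be people you have to manage, and you shouldn't manage them too much.
- In general, early on you need people who are self-motivated. They should be like founders.
What advice do you have for female co-founders who are seeking funding?
- Make your startup do well enough. If, from a venture capitalist's point of view, you fall short of the ideal in any respect, the way to solve that is to make the startup do really well.
- In fact, a year or two ago I posted a company's growth graph on Twitter without saying who they were. It was a startup founded by women; they were having trouble raising money, but their growth was amazing. So I posted it on Twitter, because I knew every venture capitalist would start asking me who it was. Growth graphs have no gender.
In your work and personal life, what methods or tricks help you be productive?
- ~~Having kids is a good way to become more productive. Because you have no time left, if you want to get anything done, you have to get a lot done each time. It makes you focus, because you have no choice.~~ (This is only a joke)
- I don't think I'm a very productive person. I have two ways of getting work done:
  - Method 1: being forced to work. My work at YC is forced on me: I have to set an application deadline. Then people apply, and I have to respond by a certain time, so I have to read the applications. I know that if I read them badly we will invest in bad startups, so I have to work very hard at reading them well.
  - Method 2: doing what interests you, which for me is writing essays. I can't help wanting to write; when I'm walking down the street, essays just start forming in my head.
- I either force myself to do the less exciting things or can't stop myself from doing the exciting ones. I don't have any useful tricks for making myself productive. If you work on what you love, you don't have to force yourself to be productive.
How do you determine what is important or genuinely interesting?
- I've come up with a test for whether you have a taste for genuinely interesting problems: whether you find doing boring things intolerable. There are some famously dull things, such as literary theory and working in middle management at a big company. So if you can stand those things, then either you have extraordinary self-discipline or you don't have a taste for genuinely interesting problems, and vice versa.
If you always hire people you like, you may end up with a monoculture company. How do you deal with the blind spots that might create?
- A lot of things go wrong when you start a startup, and you can't expect it to be perfect. The advantage of hiring people you know and like far outweighs the small drawback of a so-called monoculture. Empirically, all the most successful startups hire friends from college.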

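In the recommended readings on the “How to Start a Startup” website, Graham's lecture has two entries: one is his 2012 essay “How to Get Startup Ideas”, and the other is a 1995 video interview with Steve Jobs. The whole interview runs close to an hour and a half, but starting at around 1 hour 10 minutes the interviewer asks Steve Jobs about how he sees the relationship between startups and big companies. This issue of Maybe News SP ends with that answer from Steve Jobs: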

Technology keeps on advancing so there are opportunities.
…
Human minds settle into fixed ways of looking at the world and that’s always been true and it’s probably always going to be true.
…
One of the things that happens in organizations as well as in people is they settle into ways of looking at the world and become satisfied with those, and the world changes and keeps evolving and new potential arises, but these people that are settled in don’t see it. That’s what gives startup companies their greatest advantage is the sedentary point of view of large companies.

Issue #13

2022-06-07T19:00:00.000Z

Keywords for this issue: InfiniFS, Decentralized Social Networks, 杨海崧 (Yang Haisong)

SP #1

Paul Graham: Before the Startup

Issue #13

InfiniFS: An Efficient Metadata Service for Large-Scale Distributed Filesystems

Decentralized Social Networks

鹈鹕 Hits 043 | 回顾兵马司，杨海崧觉得最可惜的是什么？

SP #0

我们向封控区的人们收集了解封后的 100 个愿望

延伸阅读、聆听、观看

Issue #12

Paxos Made Live - An Engineering Perspective

Paxos 理论介绍

Paxosmon: Gotta Consensus Them All

Announcing InfluxDB IOx: The Future of InfluxDB Built with Rust & Arrow

MEGA DRIVE-黑色的 16bit 传奇 Vol.3 16bit 音乐探秘

Issue #11

Procella: Unifying serving and analytical data at YouTube

Presto at Facebook: State of the Union

Origin and History of Apache Arrow

Diving Deep on S3 Consistency

A State of Feast

Uptown Records 十年，不只是一家唱片店

Issue #10

Facebook’s Tectonic Filesystem: Efficiency from Exascale

How to do distributed locking

A More Flexible Paxos

昨天的黎忘年和明天的盐

Issue #9

Virtual Consensus in Delos

Virtualizing Consensus in Delos for Rapid Upgrades and Happy Engineers

RFC: Elastic Horovod

Scaling Kubernetes to 7,500 Nodes

播客文字回顾 | Late Troubles 是陈曦的“中年朋克辛酸”吗？

Issue #8

The Google File System

Colossus: Successor to the Google File System

Storage Reimagined for a Streaming World

Why We Built lakeFS: Atomic and versioned Data Lake Operations

Magnet: A scalable and performant shuffle architecture for Apache Spark

支付宝研究员王益：Go+ 可有效补全 Python 的不足

Issue #7

Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores

Delta Engine: High Performance Query Engine for Delta Lake

A Thorough Comparison of Delta Lake, Iceberg and Hudi

Bringing HPC Techniques to Deep Learning

Introducing TensorFlow Recommenders

Issue #6

Presto: SQL on Everything

Spark Architecture: Shuffle

Cosco: An Efficient Facebook-Scale Shuffle Service

Federated Learning: Collaborative Machine Learning without Centralized Training Data

分布式文件系统架构对比

Issue #5

DynamicEmbedding: Extending TensorFlow for Colossal-Scale Applications

The Next Step for Generics

Fiber: Distributed Computing for AI Made Simple

The impact of slow NFS on data systems

Kubeflow & Kale simplify building better ML Pipelines with automatic hyperparameter tuning

GoogleCloudPlatform/spark-on-k8s-operator #976: Add support for dynamic allocation via shuffle tracking

Boiled Hippo

Issue #4

AliGraph: A Comprehensive Graph Neural Network Platform

Building Uber’s Go Monorepo with Bazel

Optimising Docker Layers for Better Caching with Nix

Proposal: Permit embedding of interfaces with overlapping method sets

VexTab

Issue #3

Kudu: Storage for Fast Analytics on Fast Data

字节跳动自研强一致在线 KV & 表格存储实践

Challenges Supporting MIG in Kubernetes

How to read deep learning papers?

Farewell, TensorFlow

Issue #2

In Search of an Understandable Consensus Algorithm (Extended Version)

Scaling Raft

Why Generics?

The Open Application Model from Alibaba’s Perspective

Lightweight coscheduling based on back-to-back queue sorting

Scheduler Support for Elastic Quota Management

CFS: A Distributed File System for Large Scale Container Platforms

tensorflow/community #237: RFC: Sparse Domain Isolation for Supporting large-scale Sparse Weights Training

深入云原生 AI：基于 Alluxio 数据缓存的大规模深度学习训练性能优化

Rob Pike interview: “Go has indeed become the language of cloud infrastructure”

孤芳「自赏」：盯鞋音乐的前世与今生

LightRec: a Memory and Search-Efficient Recommender System

TFRT: A new TensorFlow runtime

Why We Need DevOps for ML Data

Mid-stack inlining in Go

Why We Leverage Multi-tenancy in Uber’s Microservice Architecture